ai mocap and tracking systems


 
we want to use live markerless mocap, either for audience interaction, for performers, or both. the reason for markerless is ease of use: participants don't have to wear mocap suits. the first system we looked at is The Captury, and we had a great hands-on demo day with Target3D in their London studio. here is the video from that day:




the system runs either on Windows with 8x OptiTrack cameras or on Linux with 8x FLIR cameras. it can track 4 people if you have a good-sized space. the issue with it is that it's really expensive. the next most similar system is newcomer move.ai - this uses a similar camera-array setup, but we have not had the opportunity to try it.



finally we discovered new kid on the block RADiCAL. this system uses a single camera to send video up to a web-based ai, which sends skeletal data back. the web space can hold up to ten performers, each dialling in on a separate pc or phone. the best thing: it's really inexpensive. it had some issues with foot sliding & latency, but hey, what do you expect for 100 dollars a year? they are updating it frequently - in fact on 07/03/25 they fixed the foot-sliding issue with version 4 in live mode! RADiCAL did a case study of nino here.

here's a short video of testing RADiCAL:

in the video, the RADiCAL website receiving the live video from the webcam is on the left; below it is the RADiCAL avatar showing the mocap; on the right is the mocap data streamed into Unreal, animating 3 avatars. you can see there is a lag between the person moving and the Unreal avatar moving. this is mainly to do with internet upload speeds (I think). however it creates quite a nice feedback loop where you copy the avatar's movement - which is your own movement. the single camera copes well with a bad lighting setup too. over the course of the residency the developers continued work on the software, implementing better control over foot placement to reduce sliding. (the other great feature of the system is that it fails gracefully - when the tracked subject leaves the volume, the avatar doesn't do anything weird, instead freezing in a nice position.)
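to make the capture → cloud → skeleton loop concrete, here's a minimal sketch of that round trip in python. everything service-specific in it is a placeholder - the endpoint URL and the JSON layout are assumptions for illustration, not RADiCAL's actual API:

```python
# minimal sketch of the single-camera round trip: webcam frames go up to a
# cloud service, skeletal data comes back per frame. the endpoint URL and
# the JSON layout are hypothetical placeholders, not RADiCAL's real API.
import asyncio
import json

import cv2          # pip install opencv-python
import websockets   # pip install websockets

SERVICE_URL = "wss://example-mocap-service/stream"  # placeholder endpoint

async def stream_webcam():
    cap = cv2.VideoCapture(0)  # default webcam
    async with websockets.connect(SERVICE_URL) as ws:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # compress each frame to jpeg before sending - upload bandwidth
            # is the main source of the lag we saw
            _, jpeg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 70])
            await ws.send(jpeg.tobytes())
            # assumed reply: one skeleton per frame as JSON
            skeleton = json.loads(await ws.recv())
            for joint in skeleton.get("joints", []):
                print(joint["name"], joint["position"])  # e.g. forward into Unreal

asyncio.run(stream_webcam())
```

the point of the sketch is the shape of the loop: every frame pays the upload cost, which is why a slow connection shows up directly as avatar lag.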




another challenger in the arena of ai mocap is Remocapp - this system can use up to 8 cameras, supports face and finger tracking, has an Unreal plugin, and is not expensive. we are testing this; one of the immediate issues is that the calibration process for each camera seems a little tricksy. we are testing with two cameras. the data is processed locally, with an internet connection only needed for licence checking. initially the skeleton, which seems smooth inside the Remocapp software, misbehaves when streaming to Unreal 5. after contacting the developers we learn to separate the 2 cameras by at least 5m, which gives a much better result.

here's a short vid of testing Remocapp:

on the left is the software; I have 2x webcams set up. on the right is the data streaming into Unreal. the software uses a local AI model rather than streaming to the web and can support up to 8 cameras. I could not get a great result with it, though I tried several times with different configs. the advice from the developers is to keep the cams at least 5m apart and at head height. there is a calibration marker for the cameras and a slightly tricksy calibration process. I had high hopes but eventually preferred RADiCAL's ease of use and graceful failures.


Rehearsing with RADiCAL and dancers at SIN in Budapest - we set up 2 devices capturing and streaming into Unreal; eventually we get three at once.




another osc system we try is posenetosc - this is built on PoseNet. Naoto from MODINA makes an .exe for this 2d open-source skeleton detection and tracking system. it's quick and web-based, and we make a widget in Unreal to smooth the incoming data and use it to 'scratch' a prerecorded animation sequence, controlled by a joint position as the user moves in front of the camera - using the method sketched below.
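the widget's logic is simple enough to sketch outside Unreal. here's a rough python version of the same idea, assuming posenetosc delivers keypoints on an osc address like /pose with the joint's x position as the first argument (both are placeholders - check what the build actually sends): an exponential moving average damps the jittery 2d data, and the smoothed position is remapped to a playback time in the prerecorded clip.

```python
# rough python sketch of the unreal widget's logic: smooth a jittery 2d
# joint from posenetosc, then remap it to a playback position so the
# performer 'scratches' through a prerecorded animation.
# the osc address and argument layout are assumptions.
from pythonosc.dispatcher import Dispatcher           # pip install python-osc
from pythonosc.osc_server import BlockingOSCUDPServer

CLIP_LENGTH = 10.0   # length of the prerecorded animation, in seconds
ALPHA = 0.2          # smoothing factor: lower = smoother but laggier

smoothed_x = 0.5     # start mid-frame (coordinates normalised 0..1)

def on_pose(address, *args):
    global smoothed_x
    raw_x = args[0]  # assumed: first argument is the tracked joint's x position
    # exponential moving average - damps pose-estimation jitter
    smoothed_x = ALPHA * raw_x + (1 - ALPHA) * smoothed_x
    # remap horizontal position directly onto a time in the clip
    scrub_time = smoothed_x * CLIP_LENGTH
    print(f"scrub animation to {scrub_time:.2f}s")  # in unreal: set the sequence position

dispatcher = Dispatcher()
dispatcher.map("/pose", on_pose)  # placeholder address

server = BlockingOSCUDPServer(("127.0.0.1", 9000), dispatcher)
server.serve_forever()
```

moving left or right in front of the camera then plays the clip backwards or forwards, which is the 'scratching' effect.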

PoseNet detecting 2 performers. This is quick and the segmentation (which limb is which) is good, but of course it's only 2D.




ai camera


testing an ai camera with dancers: we got this 4k ptz camera from OBSBOT, initially because it's possible to send it commands via osc - we considered making the camera behave like an automated follow spot, with mocap data as the input. it turned out that it has an AI feature that can track humans, and it sort of behaves like a cameraperson when it's in a larger space: it can follow movement or isolate the lower part of the body, and gives pleasing results.
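the follow-spot idea would look something like this - a minimal sketch that converts a tracked joint position from the mocap stream into pan/tilt angles and sends them to the camera over osc. the ip, port and /camera/pan, /camera/tilt addresses are placeholders: OBSBOT's actual osc command set is model-specific, so check their documentation.

```python
# sketch of the automated-follow-spot idea: take a tracked joint position
# from the mocap stream and turn it into pan/tilt commands for the ptz
# camera. the osc addresses, ip and port below are placeholders - the
# camera's real osc command set is model-specific.
import math

from pythonosc.udp_client import SimpleUDPClient  # pip install python-osc

CAMERA_IP = "192.168.1.50"   # placeholder: the camera's network address
CAMERA_OSC_PORT = 9000       # placeholder port

client = SimpleUDPClient(CAMERA_IP, CAMERA_OSC_PORT)

def aim_at(x, y, z):
    """convert a tracked position (metres, camera at the origin) into pan/tilt degrees."""
    pan = math.degrees(math.atan2(x, z))   # left/right angle to the subject
    tilt = math.degrees(math.atan2(y, z))  # up/down angle to the subject
    client.send_message("/camera/pan", pan)    # placeholder address
    client.send_message("/camera/tilt", tilt)  # placeholder address

# e.g. feed it a performer's hip joint from the mocap stream:
aim_at(x=1.2, y=0.0, z=4.0)  # performer 1.2m to the right, 4m away
```

in practice you'd also want to rate-limit and smooth the commands so the gimbal doesn't twitch with every bit of mocap jitter.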