If you have a bunch of surveillance footage, the bottleneck is your analysts' ability to comb through it. You can sit LLMs on top of faster object detection/identification algorithms to create narratives across your surveillance net that are easy to query, easy to overlay on timelines, etc.
That's fair, but I think it's a significant step beyond the queries jrochkind1 was describing. (I also don't trust LLMs to do it accurately, but maybe that part will change.)