I think you're describing technology that has existed for 15+ years and is already pretty accurate. It's not even necessarily "AI"/ML. For example, I think OpenALPR (automated license plate recognition) is all "classical" computer vision. The most accurate facial/gait/etc. recognition is most likely ML-based with a state-of-the-art model, admittedly, and perhaps the threshold of accuracy for large-scale usefulness was only crossed recently.
The guard rails IMHO are not technological but a matter of who owns the cameras/video storage backend, when/if a warrant is needed, and the criteria for granting one.
Can you explain what you mean? The queries in jrochkind1's comment are not something I'd expect AI (LLMs, I assume) to be necessary for; they're too simple and factual. (Maybe only the last one is where interpretation kicks in: knowing what to emphasize in a summary, describing actions.) Did you have something else in mind?
If you have a bunch of surveillance footage, the bottleneck is your analysts' ability to comb through it. You can sit LLMs on top of faster object detection/identification algorithms to create narratives across your surveillance net that are easy to query, can be overlaid on timelines, etc.
That's fair, but I think it's a significant step beyond the queries jrochkind1 was describing. (I also don't trust LLMs to do it accurately but maybe that part will change.)