ClickVision: Smart Video with Real-Time Object Linking | IJCT Volume 12 – Issue 6 | IJCT-V12I6P55

International Journal of Computer Techniques
ISSN 2394-2231
Volume 12, Issue 6  |  Published: November – December 2025

Authors

MADHU C K, JAHNAVI H K, PAVAN PATEL N, PAVANA B N, THILAK RAJ P

Abstract

ClickVision is a web-based intelligent multimedia framework that converts traditional video viewing into an interactive, context-aware experience. By integrating real-time computer vision and machine learning, the system lets users interact dynamically with objects, logos, and visual markers that appear within video frames. ClickVision detects objects in real time within the browser by running TensorFlow.js with the COCO-SSD model on the client side. This provides low latency, device independence, and improved privacy by avoiding server-side video uploads. A Flask-based backend uses intelligent linking and similarity-based retrieval algorithms to semantically map detected objects to pertinent external resources, including e-commerce product pages, educational materials, and informational databases. Using asynchronous communication and caching (through Redis), this architecture reduces computational overhead while facilitating smooth object-level interaction. To improve model adaptability and user privacy, ClickVision uses a Federated Learning (FL) framework with Differential Privacy (DP). This allows the system to learn cooperatively from distributed clients without access to their raw data, so that personalization and ongoing model improvement take place safely across devices. According to experimental findings, the proposed FedAvg + DP model maintains near-centralized performance while protecting data privacy, achieving 92.6% accuracy, 91.8% precision, and 93.5% recall. The modular design facilitates real-time scalability, cross-device deployment, and contextual engagement, making the platform suitable for use cases in e-commerce, education, advertising, and entertainment. By bridging the gap between passive video consumption and interactive exploration, ClickVision represents a shift toward next-generation smart multimedia systems in which users discover content dynamically and interact with it in an intuitive, context-driven manner.
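
To make the FedAvg + DP idea in the abstract concrete, the following is a minimal sketch of one aggregation round, not the authors' implementation. It assumes model weights are flat lists of floats, that each client reports its local sample count, and that privacy is approximated by clipping each client update and adding Gaussian noise to the aggregate. The names `clip_update`, `fedavg_dp_round`, `max_norm`, and `noise_std` are hypothetical, and no privacy accounting is performed.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [v * scale for v in update]


def fedavg_dp_round(global_weights, client_weights, client_sizes,
                    max_norm=1.0, noise_std=0.0, rng=None):
    """Run one FedAvg round with clipped client updates and Gaussian noise.

    max_norm and noise_std are illustrative assumptions; the source does not
    state the values used in its experiments.
    """
    rng = rng or random.Random()
    total = sum(client_sizes)
    aggregate = [0.0] * len(global_weights)
    for weights, size in zip(client_weights, client_sizes):
        # Each client contributes only a weight delta, never raw data.
        delta = [w - g for w, g in zip(weights, global_weights)]
        delta = clip_update(delta, max_norm)
        for i, d in enumerate(delta):
            # Weight each client by its share of the total samples (FedAvg).
            aggregate[i] += (size / total) * d
    # Add Gaussian noise to the averaged update before applying it.
    noisy = [a + rng.gauss(0.0, noise_std) for a in aggregate]
    return [g + n for g, n in zip(global_weights, noisy)]


if __name__ == "__main__":
    new_weights = fedavg_dp_round(
        global_weights=[0.0, 0.0],
        client_weights=[[1.0, 0.0], [0.0, 1.0]],
        client_sizes=[1, 3],
        max_norm=10.0,
        noise_std=0.01,
        rng=random.Random(0),
    )
    print(new_weights)
```

Clipping bounds how much any single client can move the global model, which is what gives the added noise a meaningful scale; in a real deployment the noise level would be chosen from a target privacy budget.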

Keywords

TensorFlow.js, COCO-SSD, Flask, federated learning, differential privacy, client-side machine learning, multimedia engagement, context-aware video, intelligent interactivity, real-time object detection, user experience.

Conclusion

The proposed ClickVision framework introduces an intelligent, interactive, and context-aware multimedia platform that transforms the traditional way of watching videos. By combining real-time object detection with web-based interactivity, ClickVision turns static videos into dynamic experiences that enable users to explore, learn, and interact directly with on-screen content.

Using TensorFlow.js for client-side inference and Flask for backend intelligence provides a balanced trade-off between security, scalability, and speed. The client handles computationally demanding tasks locally, reducing latency and improving privacy, while the server offers contextual understanding through semantic matching and API integration. This dual-layer architecture enables effective load distribution and greatly increases the system's adaptability to diverse environments.

Experiments on various devices validated the system's performance: it attained over 94% accuracy on desktop devices and remained responsive on smartphones and tablets, with frame rates above 25 FPS. Because latency stayed within a human-interactive threshold, users received real-time object recognition feedback without visual lag.

From an application perspective, ClickVision has potential in a wide range of fields:
• In e-commerce, customers can click on products they recognize in a video to view more information or make an instant purchase.
• In education, students can engage with visual content and instantly access relevant resources, definitions, and tutorials.
• In advertising, brands can use interactive visual engagement to create personalized, immersive campaigns.
• In accessibility, the system can assist users with visual impairments by providing contextual descriptions of detected objects.
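
The click-to-link interaction summarized above can be sketched as a hit test over detected bounding boxes followed by a cached resource lookup. This is an illustrative sketch under stated assumptions, not the authors' code: detections are modeled as dictionaries with `bbox` as `[x, y, width, height]`, `class`, and `score` (the prediction shape returned by the COCO-SSD model in TensorFlow.js), the `RESOURCE_LINKS` table and its URLs are hypothetical, and `functools.lru_cache` stands in for the Redis cache used by the real backend.

```python
from functools import lru_cache

# Hypothetical label-to-resource table; the real backend resolves links
# through semantic matching and external APIs.
RESOURCE_LINKS = {
    "laptop": "https://example.com/products/laptop",
    "book": "https://example.com/learn/book",
}


def hit_test(detections, x, y, min_score=0.5):
    """Return the smallest confident detection containing (x, y), or None."""
    hits = []
    for det in detections:
        left, top, width, height = det["bbox"]
        inside = left <= x <= left + width and top <= y <= top + height
        if inside and det["score"] >= min_score:
            hits.append(det)
    if not hits:
        return None
    # Prefer the smallest box so nested objects win over their containers.
    return min(hits, key=lambda d: d["bbox"][2] * d["bbox"][3])


@lru_cache(maxsize=256)
def resolve_link(label):
    """Look up the resource for a label; the in-process cache stands in for Redis."""
    return RESOURCE_LINKS.get(label)


if __name__ == "__main__":
    frame_detections = [
        {"bbox": [0, 0, 100, 100], "class": "person", "score": 0.9},
        {"bbox": [10, 10, 20, 20], "class": "laptop", "score": 0.8},
    ]
    clicked = hit_test(frame_detections, 15, 15)
    if clicked is not None:
        print(clicked["class"], resolve_link(clicked["class"]))
```

Choosing the smallest containing box is one simple way to disambiguate overlapping detections; a production system might instead weigh detection score or object tracking history.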

How to Cite This Paper

MADHU C K, JAHNAVI H K, PAVAN PATEL N, PAVANA B N, THILAK RAJ P (2025). ClickVision: Smart Video with Real-Time Object Linking. International Journal of Computer Techniques, 12(6). ISSN: 2394-2231.

© 2025 International Journal of Computer Techniques (IJCT). All rights reserved.