Abstract
Millions of people around the world face daily challenges due to visual impairments, particularly in navigation and object recognition. While existing tools like canes and guide dogs offer partial support, they often lack real-time context
awareness and are inaccessible to many due to high costs or limited
availability. This paper presents Vogui, a low-cost, user-centric
assistive technology system that integrates an ESP32-CAM module with cloud-based artificial intelligence to deliver real-time
auditory feedback. The system captures environmental images,
sends them to AI servers for object and pattern recognition,
translates the scene into natural language descriptions, and
communicates this information to users through synthesized
speech. Extensive field testing was conducted in indoor and
outdoor settings under various lighting and network conditions
(3G/4G). The prototype achieved an average response latency of
1.5 seconds (4G), a recognition accuracy of 90%, and maintained
stable connectivity with minimal power consumption.
Comparisons against smartphone-based alternatives highlight
Vogui’s superior specialization, autonomy, and affordability.
These findings demonstrate the feasibility and effectiveness of
Vogui in promoting independent mobility and social inclusion for
the visually impaired, especially in resource-constrained
environments.
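The abstract describes a four-stage pipeline: capture an image on the ESP32-CAM, send it to a cloud AI service for object and pattern recognition, render the result as a natural-language description, and speak it to the user. A minimal sketch of that flow is below; every function name, the detection format, and the example labels are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the Vogui pipeline: capture -> recognize ->
# describe -> speak. Stand-in functions only; the real system uses an
# ESP32-CAM, a cloud recognition API, and a text-to-speech engine.

def capture_image() -> bytes:
    # Stand-in for grabbing a JPEG frame from the ESP32-CAM.
    return b"...jpeg bytes..."

def recognize(image_bytes: bytes) -> list[str]:
    # Stand-in for the cloud AI call returning detected object labels.
    return ["door", "chair"]

def describe(objects: list[str]) -> str:
    # Turn the detections into one natural-language sentence.
    if not objects:
        return "Nothing detected ahead."
    return "Ahead of you: " + ", ".join(objects) + "."

def speak(text: str) -> None:
    # Stand-in for synthesized speech to the user's earpiece.
    print(text)

if __name__ == "__main__":
    speak(describe(recognize(capture_image())))
```

Running the sketch prints `Ahead of you: door, chair.`, mirroring the scene-to-speech loop the paper evaluates for latency and accuracy.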
| Original language | Spanish (Peru) |
|---|---|
| Pages | 0-10 |
| Status | Indexed - 1 Mar 2026 |
| Event | 12th International Conference on Control, Mechatronics and Automation, ICCMA 2024 - London, United Kingdom. Duration: 11 Nov 2024 → 13 Nov 2024 |
Conference
| Conference | 12th International Conference on Control, Mechatronics and Automation, ICCMA 2024 |
|---|---|
| Country/Territory | United Kingdom |
| City | London |
| Period | 11/11/24 → 13/11/24 |
Keywords
- Assistive technology
- Text-to-Speech
- Human-Computer Interaction