Overview
magicClicker() is a powerful vision-based automation function that allows you to click on UI elements by describing them in natural language. It uses the Moondream API to analyze screenshots and locate elements on screen.
Function Signature
Parameters
Natural language description of the UI element you want to click. Be specific about the element’s appearance and location.
Examples:
- “Profile button in bottom right corner”
- “Search icon at the top”
- “Red notification bell”
- “Submit button”
How It Works
- Screenshot Capture: Takes a screenshot of the current screen using the ScreenCaptureService
- Image Processing: Converts the screenshot to base64-encoded JPEG format
- Vision AI Analysis: Sends the image and description to the Moondream API’s /v1/point endpoint
- Coordinate Detection: Receives normalized coordinates (x, y) of the target element
- Click Execution: Converts normalized coordinates to pixel coordinates and performs the click using AccessibilityService
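The last step, turning normalized coordinates into a tap position, can be sketched in plain Kotlin. This is an illustration rather than the function's actual code: the names `PixelPoint` and `toPixels` are hypothetical, and the clamping of out-of-range values is an assumption added for safety.

```kotlin
import kotlin.math.roundToInt

/** Pixel position on the screen (hypothetical helper type). */
data class PixelPoint(val x: Int, val y: Int)

/**
 * Maps normalized coordinates (0.0..1.0) to pixel coordinates.
 * Inputs are clamped so a slightly out-of-range value still lands on screen.
 */
fun toPixels(normX: Double, normY: Double, screenWidth: Int, screenHeight: Int): PixelPoint {
    require(screenWidth > 0 && screenHeight > 0) { "screen dimensions must be positive" }
    // Scale by the screen size, then keep the result inside the last valid pixel index
    val px = (normX.coerceIn(0.0, 1.0) * screenWidth).roundToInt().coerceAtMost(screenWidth - 1)
    val py = (normY.coerceIn(0.0, 1.0) * screenHeight).roundToInt().coerceAtMost(screenHeight - 1)
    return PixelPoint(px, py)
}

fun main() {
    // Horizontally centered, three quarters of the way down a 1080x2400 screen
    println(toPixels(0.5, 0.75, 1080, 2400)) // PixelPoint(x=540, y=1800)
}
```

The resulting pixel position is what would be handed to the click step.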
Code Examples
Basic Usage
Real-World Examples
Email Automation Example
Vision API Details
Moondream API Integration
The function uses Moondream’s point detection API:
Endpoint: https://api.moondream.ai/v1/point
Request Format:
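A minimal sketch of how such a request body might be assembled. The JSON field names `image_url` and `object` are assumptions based on Moondream's public API documentation and should be checked against the current reference. The helper names `jsonEscape` and `buildPointRequestBody` are hypothetical.

```kotlin
import java.util.Base64

/** Escapes characters that must not appear raw inside a JSON string. */
fun jsonEscape(text: String): String = buildString {
    for (ch in text) {
        when (ch) {
            '"' -> append("\\\"")
            '\\' -> append("\\\\")
            '\n' -> append("\\n")
            '\r' -> append("\\r")
            '\t' -> append("\\t")
            else -> if (ch < ' ') append("\\u%04x".format(ch.code)) else append(ch)
        }
    }
}

/**
 * Builds the JSON body for a point request.
 * Assumed fields: "image_url" (base64 data URL) and "object" (the description).
 */
fun buildPointRequestBody(jpegBytes: ByteArray, description: String): String {
    val base64 = Base64.getEncoder().encodeToString(jpegBytes)
    val dataUrl = "data:image/jpeg;base64,$base64"
    return """{"image_url":"$dataUrl","object":"${jsonEscape(description)}"}"""
}

fun main() {
    // Placeholder bytes; real input would be the compressed screenshot
    println(buildPointRequestBody(byteArrayOf(1, 2, 3), "Submit button"))
    // {"image_url":"data:image/jpeg;base64,AQID","object":"Submit button"}
}
```

Escaping the description matters because user-written descriptions can contain quotes, which would otherwise produce invalid JSON.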
Coordinate Transformation
The API returns normalized coordinates (0.0 to 1.0). These are converted to screen pixels by multiplying x by the screen width and y by the screen height.
Best Practices
Writing Good Descriptions
Good descriptions:
- “Blue send button in bottom right”
- “Profile icon with circular avatar”
- “Red close X button at top”
Poor descriptions:
- “button” (too vague)
- “the thing” (not descriptive)
- “click here” (no visual information)
Error Handling
The function includes built-in error handling:
- No Screenshot Available: Returns early with voice feedback
- Element Not Found: If Moondream returns null, the user is notified via speech
- Activity Destroyed: Safely exits if the activity is no longer active
- API Errors: Logged to the console with error messages
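One way to picture these cases is as a closed set of outcomes, each mapped to the feedback described above. The sketch below is illustrative only: `ClickOutcome`, `spokenFeedback`, and the message strings are hypothetical and are not taken from the function's implementation.

```kotlin
/** Possible outcomes of one magic click attempt (names are illustrative). */
sealed class ClickOutcome {
    data class Clicked(val x: Int, val y: Int) : ClickOutcome()
    object NoScreenshot : ClickOutcome()
    object ElementNotFound : ClickOutcome()
    object ActivityDestroyed : ClickOutcome()
    data class ApiError(val message: String) : ClickOutcome()
}

/** Returns the text to speak to the user, or null when the outcome is silent. */
fun spokenFeedback(outcome: ClickOutcome): String? = when (outcome) {
    is ClickOutcome.Clicked -> null
    is ClickOutcome.NoScreenshot -> "No screenshot available"       // voice feedback, then return early
    is ClickOutcome.ElementNotFound -> "Could not find that element" // user notified via speech
    is ClickOutcome.ActivityDestroyed -> null                        // exit quietly, nothing to notify
    is ClickOutcome.ApiError -> null                                 // logged rather than spoken
}

fun main() {
    println(spokenFeedback(ClickOutcome.ElementNotFound)) // Could not find that element
    println(spokenFeedback(ClickOutcome.ApiError("HTTP 500"))) // null
}
```

Using a sealed class lets the compiler confirm that every failure case has been handled in the `when` expression.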
Performance Considerations
- Asynchronous Execution: Runs in a coroutine to avoid blocking the UI thread
- Image Compression: Screenshots are compressed to 85% JPEG quality for faster transmission
- Timeout: Network requests time out after 30 seconds
- Memory Management: Bitmaps are properly recycled after use
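As a rough illustration of the 30-second timeout, the sketch below configures a connection with the JDK's `HttpURLConnection`. This is an assumption made for the example: the HTTP client the function actually uses is not stated here. The `X-Moondream-Auth` header name is likewise an assumption based on Moondream's public documentation, and `openPointConnection` is a hypothetical helper.

```kotlin
import java.net.HttpURLConnection
import java.net.URL

/** Timeout in milliseconds, matching the 30-second limit described above. */
const val TIMEOUT_MS = 30_000

/**
 * Prepares (but does not open) a POST connection with connect and read timeouts.
 * No network traffic happens until the caller writes the body or reads the response.
 */
fun openPointConnection(endpoint: String, apiKey: String): HttpURLConnection {
    val conn = URL(endpoint).openConnection() as HttpURLConnection
    conn.requestMethod = "POST"
    conn.connectTimeout = TIMEOUT_MS  // limit on establishing the connection
    conn.readTimeout = TIMEOUT_MS     // limit on waiting for response data
    conn.doOutput = true              // the request carries a JSON body
    conn.setRequestProperty("Content-Type", "application/json")
    conn.setRequestProperty("X-Moondream-Auth", apiKey) // assumed header name
    return conn
}

fun main() {
    val conn = openPointConnection("https://api.moondream.ai/v1/point", "YOUR_API_KEY")
    println("${conn.requestMethod} with ${conn.readTimeout} ms read timeout")
    // POST with 30000 ms read timeout
}
```

On Android, the JPEG compression step would typically go through `Bitmap.compress` with a quality of 85, which is omitted here because it cannot run outside the platform.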
Tracking and Analytics
Each magic click is tracked to Firebase for analytics.
Related Functions
- magicScraper() - Extract text/data from screen using natural language
- simulateClick() - Direct coordinate-based clicking without vision AI
- launchGmail() - Launch specific apps before using magicClicker