Open ideas. If you decide to build any of these, let me know!
- CLI tool that takes an image and sticks it on the web as a permalink. Currently I copy images into GitHub and steal the link that it generates.
- Reresume: upload a resume and a job description link and get a rewritten resume tailored to the job.
- Sweresume: automatic LaTeX resumes for software engineers following a standard template.
- Convert research papers to running code automatically, or at least a repo scaffold.
- Match post writers across the web using stylometry.
- Receipt extraction for splitting meals and groceries. Just take a picture of what you ate or purchased and have an LLM do the math. Could be possible when the GPT-V API drops. Fuyu-8B might work as well.
- Build a knowledge graph of code blocks from an entire repository. An agent could operate on this graph asynchronously. I believe Cursor is working on something like this.
- Interview prep using ChatGPT + voice mode.
- Write a blog post evangelizing VS Code.
- Embed your entire Twitter/Discord/etc and make connections from your likes, followers, etc.
- Display some of the best pieces of writing on the Internet in a common place, with beautiful styling. For example, all of the Paul Graham essays, some Twitter posts, and Ted Chiang stories like “Understand” that only exist on old web archives.
- Dscan: scan through your data and “swipe left/right” on good/bad examples to manually create a training set.
- Convert images to LaTeX. Could be useful for converting complicated graphs or diagrams.
- Automatically teach yourself using short-form videos like the Subway Surfers TikToks. Can query content from anywhere: Reddit, Twitter, etc. Need to make a more general data querying tool that I can use for things.
- Summarize any subreddit with LLMs by querying the RSS.
- Rebuild popular products like Postman as free, open-source tools.
- An AR tool that lets you query references you like to make. Imagining something like Family Guy cutaways, effectively querying your memories.
- A way to modify a calendar state by directly altering the JSON version with an LLM.
- Embed solutions to LeetCode problems to compare similarity.
- A language model to translate mixed languages like Chinglish or Hinglish.
- A/B testing for agentic web browsing.
- Automatically create YouTube shorts/TikToks from subreddits.
- Solve the issue where you have a ton of Jupyter notebooks all with slightly different variations of the same pre-processing code.
- A little guy who organizes and cleans up your codebase while you’re not using your computer.
- Link entire codebases together graphically and automatically and allow users to parse through the function calls in a spatial UI – “spatial coding”.
- Map coding error messages to their solutions to build up a repository of personal debugging tricks automatically.
  - A model could use this data for its own exploration when it's stuck, so it learns from past stack traces + Google search.
- GPT vision tools:
  - Better autonomous web browsing
  - Receipt extraction
  - Math homework helper
  - Image to LaTeX converter
  - Calorie counter by converting image to ingredients + use RAG to compare against nutrition facts
  - Recipe generator from ingredients
  - Diagram generator from whiteboard
  - Excel analyzer
  - Detect fake vs. real sneakers
  - Structure-anything: convert anything to structured data
  - An “anything API” since it can see the whole web
  - Something that replaces Selenium/Playwright for web scraping
  - Solve web accessibility
  - Analyze screenshots of audio waves
- Actually good dev-tools for things like base64 conversion, OpenAI token counting, etc. Good for LLM devs and regular devs alike.
- Better interface for constant web-scraping using this.
- Convert an MIT course into useful lecture notes. Should add a sliding scale that lets you modify the difficulty; this can be done by pre-generating content for different education levels.
- Compressing thought and expanding it again is like noising and denoising in diffusion models.
- Continuously summarize HN, Reddit, etc. and generate podcast episodes, similar to ScribePod.
- Better way to map things you like in a city. Felt is doing a great job at this.
- Generate color palette suggestions from aesthetic images.
- Nice view components for Spotify and other apps that are easy to drop into your website.
- Chrome extension for bionic reading.
- BeReal memories downloader.
- Notion page for all LeetCode problems with better solutions.
- Predict Stable Diffusion prompts from images. Can train on DiffusionDB.
- LeetCode but for debugging. There is currently no way to practice scenario-specific debugging. Solved: 1
- Stable Diffusion + reCAPTCHA.
- Twitter except it’s just posts your friends have liked.
- An easy way to analyze scrapes of your own liked tweets in embedding space, something better than grep.
- Website to help develop intuition on hard math/CS concepts, best quality explainers.
- Watermarking handwriting/typed text algorithmically, something like what Scott Aaronson worked on at OpenAI.
- Grammarly + GPT.
- Teach students things on TikTok by distilling MIT OCW into short-form clips for coding and such.
- Hack viral short-form marketing for any product launch.
- Train a model to chunk a long passage into human-friendly sections.
- Convert git commit history to a changelog. It could be something you add to a repo (a GitHub Action?) that keeps the README updated, for instance.
- Take any data and convert it into a knowledge graph, with a simple API that makes it useful for other projects.
- Build a real-time navigation tool for blind people as an iPhone app.
- Visualize information as embeddings, like taking a textbook and converting it into a cloud graph.
- Segment a webpage → embed it → select pieces based on a nearest neighbor search of the embeddings/query. Requires contrastive data for a web snippet/text, like CLIP.
- Try using CogVLM to break CAPTCHAs.
- Build a text editor that periodically takes snapshots of the text area and sends them to GPT for autocorrect, basically an auto-editing writing editor.
- Fine-tune an LLM on your own tweets.
- Rebuild this but with GPT-V.
- Route to any model, cloud-based or local, with ease.
- Visualize loss curves in 3D with some cool interface.
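
For the segment → embed → nearest-neighbor idea above, here is a rough sketch of what the pipeline might look like. Everything in it is an assumption for illustration: `segment`, `embed`, `cosine`, and `top_segments` are hypothetical helpers, segments are just blank-line-separated chunks of text, and the "embedding" is a toy bag-of-words count rather than the contrastively trained model the idea actually calls for. Swapping `embed` for a real embedding model would be the first step toward something useful.

```python
import math
import re
from collections import Counter


def segment(page_text: str) -> list[str]:
    """Split a page into segments on blank lines (stand-in for real DOM segmentation)."""
    return [s.strip() for s in re.split(r"\n\s*\n", page_text) if s.strip()]


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption); a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_segments(page_text: str, query: str, k: int = 2) -> list[str]:
    """Return the k segments nearest to the query under the toy embedding."""
    query_vec = embed(query)
    ranked = sorted(
        segment(page_text),
        key=lambda s: cosine(embed(s), query_vec),
        reverse=True,
    )
    return ranked[:k]


# Made-up page text, purely for demonstration.
page = """Pricing starts at ten dollars per month for the basic plan.

Our team is based in Toronto and Berlin.

Contact support by email for refunds on any plan."""

print(top_segments(page, "how much does the basic plan cost per month", k=1))
```

The brute-force sort is fine for a single page; for many pages an approximate nearest-neighbor index would likely be the better choice.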