Idea List

Open ideas. If you decide to build any of these, let me know!

CLI tool that takes an image and sticks it on the web as a permalink. Currently I copy images into GitHub and steal the link that it generates.
Reresume: upload a resume and a job description link and get a rewritten resume tailored to the job.
Sweresume: automatic LaTeX resumes for software engineers following a standard template. 1
Convert research papers to running code automatically, or at least a repo scaffold.
Match post writers across the web using stylometry.
Receipt extraction for splitting meals and groceries. Just take a picture of what you ate or purchased and have an LLM do the math. Could be possible when the GPT-V API drops. Fuyu-8B might work as well.
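Once the line items are extracted, the splitting arithmetic itself is small. Here is a minimal sketch, assuming some vision model has already turned the photo into `(price_cents, people)` pairs; the function name, the input shape, and the proportional handling of tax and tip are all assumptions for illustration, not part of the original idea.

```python
def split_receipt(items, extra_cents=0):
    """Split a receipt between people.

    items: list of (price_cents, [names of people sharing that item]).
    extra_cents: tax plus tip, spread in proportion to each person's subtotal.
    Returns a dict mapping each person to the cents they owe.
    """
    subtotals = {}
    for price_cents, people in items:
        # Each shared item is divided evenly among the people who shared it.
        for person in people:
            subtotals[person] = subtotals.get(person, 0) + price_cents / len(people)

    total = sum(subtotals.values())
    if not total:
        return {person: 0 for person in subtotals}

    # Rounding to whole cents can leave the sum off by a cent or two.
    return {
        person: round(subtotal + extra_cents * subtotal / total)
        for person, subtotal in subtotals.items()
    }
```

Working in integer cents avoids most floating point surprises; a real tool would still need to decide who absorbs the leftover cent after rounding.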
Build a knowledge graph of code blocks from an entire repository. There can be an agent that operates on this graph asynchronously. I believe Cursor is working on something like this.
Interview prep using ChatGPT + voice mode.
Write a blog post evangelizing VS Code.
Embed your entire Twitter/Discord/etc. and make connections from your likes, followers, etc.
Display some of the best pieces of writing on the Internet in a common place, with beautiful styling. For example, all of the Paul Graham essays, some Twitter posts, and Ted Chiang stories like Understand that only exist on old web archives.
Dscan: scan through your data and “swipe left/right” on good/bad examples to manually create a training set. 1, 2
Convert images to LaTeX. Could be useful for converting complicated graphs or diagrams.
Automatically teach yourself using short form videos like the Subway Surfers TikToks. Can query content from anywhere: Reddit, Twitter, etc. Need to make a more general data querying tool that I can use for things.
Summarize any subreddit with LLMs by querying the RSS.
Rebuild popular products like Postman as free, open-source tools.
An AR tool that lets you query references you like to make. Imagining something like Family Guy cutaways, effectively querying your memories.
A way to modify a calendar state by directly altering the JSON version with an LLM.
Embed solutions to LeetCode problems to compare similarity.
A language model to translate mixed languages like Chinglish or Hinglish.
A/B testing for agentic web browsing.
Automatically create YouTube shorts/TikToks from subreddits.
Solve the issue where you have a ton of Jupyter notebooks all with slightly different variations of the same pre-processing code.
A little guy who organizes and cleans up your codebase while you’re not using your computer.
Link entire codebases together graphically and automatically and allow users to parse through the function calls in a spatial UI – “spatial coding”.
Map coding error messages to their solutions to build up a repository of personal debugging tricks automatically.
- A model could use this data for its own exploration when it's stuck, so it learns from past stack traces + Google search.
GPT vision tools:
- Better autonomous web browsing
- Receipt extraction
- Math homework helper
- Image to LaTeX converter
- Calorie counter by converting image to ingredients + use RAG to compare against nutrition facts
- Recipe generator from ingredients
- Diagram generator from whiteboard
- Excel analyzer
- Detect fake vs. real sneakers
- Structure-anything: convert anything to structured data
- An “anything API” since it can see the whole web
- Something that replaces Selenium/Playwright for web scraping
- Solve web accessibility
- Analyze screenshots of audio waves
Actually good dev-tools for things like base64 conversion, OpenAI token counting, etc. Good for LLM devs and regular devs alike.
Better interface for constant web-scraping using this.
Convert an MIT course into useful lecture notes. Should add a sliding scale that lets you modify the difficulty; can do this by pre-generating content for different education levels. 1
Compressing thought and expanding it again is like noising and denoising in diffusion models.
Continuously summarize HN, Reddit, etc. and generate podcast episodes, similar to ScribePod.
Better way to map things you like in a city. Felt is doing a great job at this.
Generate color palette suggestions from aesthetic images.
Nice view components for Spotify and other apps to make it easy to drop on your website.
Chrome extension for bionic reading.
BeReal memories downloader.
Notion page for all LeetCode problems with better solutions.
Predict Stable Diffusion prompts from the images. Can train on DiffusionDB.
LeetCode but for debugging. There is currently no way to practice scenario-specific debugging. Solved: 1
Stable Diffusion + reCAPTCHA.
Twitter except it’s just posts your friends have liked.
An easy way to analyze scrapes of your own liked tweets in embedding space, something better than grep.
Website to help develop intuition on hard math/CS concepts, best quality explainers.
Watermarking handwriting/typed text algorithmically, something like what Scott Aaronson worked on at OpenAI.
Grammarly + GPT.
Teach students things on TikTok by distilling MIT OCW into short form clips for coding and such.
Hack viral short form marketing for any product launch.
Train a model to chunk a long passage into human friendly sections. 1
Convert git commit history to a changelog. It could be something you add to a repo (GitHub action?) that keeps the README updated for instance.
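A rough sketch of the core step, grouping commit subject lines into changelog sections. It assumes commits follow a `type(scope): message` prefix convention, which the idea above does not require; `SECTIONS` and `build_changelog` are hypothetical names for illustration.

```python
import re
from collections import defaultdict

# Hypothetical mapping from commit-subject prefixes to changelog headings.
SECTIONS = {"feat": "Added", "fix": "Fixed", "docs": "Documentation"}

# Matches subjects such as "feat(cli): add upload" or "fix: handle empty files".
PREFIX = re.compile(r"^(\w+)(?:\([^)]*\))?!?:\s*(.+)$")


def build_changelog(subjects, version="Unreleased"):
    """Group commit subject lines under changelog headings."""
    grouped = defaultdict(list)
    for subject in subjects:
        subject = subject.strip()
        match = PREFIX.match(subject)
        if match and match.group(1) in SECTIONS:
            grouped[SECTIONS[match.group(1)]].append(match.group(2))
        else:
            # Anything without a recognized prefix is kept verbatim.
            grouped["Other"].append(subject)

    lines = [f"## {version}"]
    for heading in [*SECTIONS.values(), "Other"]:
        if grouped[heading]:
            lines.append(f"### {heading}")
            lines.extend(f"- {entry}" for entry in grouped[heading])
    return "\n".join(lines)
```

The subject lines could come from `git log --pretty=format:%s`, one per line. Repos with freeform commit messages would land entirely under "Other", which is where an LLM summarization pass might earn its keep.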
Take any data and convert it into a knowledge graph, with a simple API that makes it useful for other projects.
Build a realtime navigation tool for blind people into an iPhone app. 1
Visualize information as embeddings, like taking a textbook and converting it into a cloud graph. 1, 2
Segment a webpage → embed it → select pieces based on a nearest neighbor search of the embeddings/query. Requires contrastive data for a web snippet/text, like CLIP.
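The segment → embed → nearest neighbor pipeline can be sketched end to end with a toy embedding. The bag-of-words `embed` below is only a stand-in for the contrastive model the idea calls for, and splitting on blank lines is an assumed segmentation rule; all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. A stand-in for a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment(page_text):
    """Split page text into segments on blank lines (assumed rule)."""
    return [s.strip() for s in re.split(r"\n\s*\n", page_text) if s.strip()]


def top_segments(page_text, query, k=1):
    """Return the k segments most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine(embed(seg), query_vec), seg) for seg in segment(page_text)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seg for _, seg in scored[:k]]
```

Swapping `embed` for a real text or image-text encoder leaves the rest of the pipeline unchanged, which is the main reason to keep the three steps separate.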
Try using CogVLM to break CAPTCHAs. 1
Build a text editor that periodically takes snapshots of the text area and sends them to GPT for autocorrect, an auto editing writing editor basically.
Fine-tune an LLM on your own tweets.
Rebuild this but with GPT-V.
Route to any model, cloud-based or local, with ease.
Visualize loss curves in 3D with some cool interface.
