Code wikis are documentation theater as a service

38 points by theletterf 2 days ago

xrd 2 days ago

This is terrific writing, and it's exactly what we lose when we pretend AI can do terrific writing.

The biggest problem we face right now is that the large majority of people are terrible writers and can't recognize why this is awful. It really felt like, in the moment before ChatGPT arrived, we were coming into a new world where the craft of writing was surging in popularity and making a difference. That all feels lost.

This kind of post makes me have hope.

theletterf a day ago

Thank you. I'm glad it makes you feel this way.
- nrhrjrjrjtntbt a day ago
  
  TIL: ersatz!
bsder a day ago

> The biggest problem we face right now is that the large majority of people are terrible writers and can't recognize why this is awful.
PowerPoint created a very similar Eternal September.
After PowerPoint got done applying lipstick to the pig, your incoherent garbage was presented on screen with neat fonts and bullet points, so I couldn't immediately judge your lack of preparedness. I had to wade through 10+ minutes of your slop before realizing you were wholly unprepared.
AI is just extending the amount of time I have to waste before realizing that what I'm looking at is slop. I'm not happy about this.

sorobahn a day ago

I’m actually interested in solving the documentation problem. Imo we as engineers are thinking too small and keeping docs as this side thing sounds like a recipe for irregular maintenance. Instead, what if docs were more like live blueprints of running systems? We don’t want obvious stuff documented like there is a function called foo, but foo’s relationship to other parts of the code and its runtime characteristics seem important. I think I’m imagining a different form of documentation that is tied with observability but that’s because I feel it’s information that’s far away from code currently and ideally I’d like all information derived from a piece of code to be available at the same place.

Probably slightly off topic but I’d be curious to hear what other people want out of automated systems in this space. I have so many half-baked ideas and would love to hear what others think/want.

theletterf a day ago

My suggestion on how to solve this: https://passo.uno/from-tech-writers-to-ai-context-curators/

cxr 2 days ago

If you're upset about these things, the one thing not to do is to empower them by yes-anding the people involved as they debase the meaning of words, in the way the author of this article does:

> I’ve tried it on one of my pet projects and it produced an entire wiki full of dev docs

Did it? No, it didn't. "Wiki" is not a synonym for "project documentation". (You could set up a wiki to manage the documentation for your project. But that's not what any of these things are about.)

These aren't wikis.

theletterf 2 days ago

You're right. I wrapped "wiki" in quotation marks in the post. Thank you for reminding me. I also added a callout.

cryzinger 2 days ago

Even without introducing LLMs into the equation, I've been brought on as the technical writer for many projects where the team says "oh, we already have a readme, you just need to clean it up" and then all of the readme definitions for parameters or settings or whatever are like:

    brickLock: The lock of the brick.
    brickDrink: The drink of the brick.
    brickWink: The wink of the brick.

...which is to say, definitions that just restate whatever's evident from the code or variable names themselves, and that make sense if you're already familiar with the thing being defined, but don't actually explain their purpose or provide context for how to use them (in other words, the main reasons to have documentation).
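One way to make the point above concrete is a rough lint for descriptions that only restate the name. This is a minimal sketch, not an existing tool: the function names, the stopword list, and the second description in the usage lines are all hypothetical, and a plain word-overlap check like this will miss paraphrases.

```python
from __future__ import annotations

import re

# Hypothetical stopword list; a real checker would need a larger one.
STOPWORDS = {"the", "a", "an", "of", "for", "to", "is", "this", "that"}


def split_identifier(name: str) -> set[str]:
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", name)
    return {part.lower() for part in parts}


def is_tautological(name: str, description: str) -> bool:
    """Return True if the description adds no word beyond the name itself."""
    name_words = split_identifier(name)
    desc_words = {w.lower() for w in re.findall(r"[A-Za-z]+", description)}
    return not (desc_words - STOPWORDS - name_words)


# The readme entry from the example above restates the name, so it is flagged.
print(is_tautological("brickLock", "The lock of the brick."))  # True

# A made-up description that explains purpose passes the check.
print(is_tautological("brickLock", "Blocks concurrent edits during a rebuild."))  # False
```

A check like this can only flag the empty definitions. It can't supply the purpose or context that is missing from them.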

My role as a writer is then to (1) extract net-new information out of the team, (2) figure out how all of that new info fits together, (3) figure out the implications of that info for readers/users, and then (4) assemble it in an attractive manner.

An autogenerated code wiki (or a lazy human) can presumably do the fourth step, but it can't do the first three steps preceding it, and without those preceding steps you're just rearranging known data. There are times where that can be helpful, but it's more often just gloss.

Neywiny a day ago

This is what I wanted to focus on so thanks for starting the convo. This all feels like 100% coverage = perfectly tested, no bugs possible. Nooooo, there needs to be more than that. I recently had a really good readme for a project in a heavy development phase. Basically everything I'd done, every command, every concept, got documented. I'd worry about cleanup later. I did not put in every line of code, I put concepts. So when a new person got brought on and asked stuff like "well but how do I change the config?" Bam, it's in the readme. Over and over, every task I had to do, they had to at least consider or understand, so it's in the readme. Of course I did start with a quick-start "how do I use this repo" and only later did "how do I develop this repo" but still, it was all useful because it's what I needed.
It doesn't seem impossible for an LLM to go "hmmm, the way this repo passes configurations around isn't standard. I should focus more on that." But that's a level of understanding I don't think they currently have.
- RealityVoid a day ago
  
  > But that's a level of understanding I don't think they currently have
  I think they do, at least in some of the cases, especially if it's something well represented in the dataset. I've been surprised sometimes by the insights it provides, and other times it's completely useless. That's one of the problems, it's unreliable, so you have to treat all info it gives you with doubt. But, anyways, at times it makes very surprising and seemingly intelligent observations. It's worth at least considering it and thinking it through.
  - Neywiny a day ago
    
    I guess I should try it before dismissing it, but I would be curious to see if it can accurately detect which things we've found workarounds for that need special attention and whatnot.
    
    RealityVoid 11 hours ago
    
    As mentioned, it's a throw of the die. It can find very obscure things that you forgot about and give you good ideas. It might come with utterly stupid ideas. You clearly need to drive these things else they will drive you and you won't like where it will take you.
moomin a day ago

There’s a tool that used to be popular in the .NET community called GhostDoc, that did pretty much exactly what you’re describing: rewording the blindingly obvious. I loathed it. But in terms of filling a very specific and all-too-common niche of “My manager’s insisting we do this thing, but has allocated no time to it, and will never spend more than five minutes verifying that it is done” it was excellent. I feel like Google is just creating the next generation of that technology and it will be very effective at solving the same problem.
rkomorn a day ago

Sorry for the tangent but is there a story behind the choice of "brick lock/drink/wink" for your example?
It's so odd and random it seems like there must be more to it.
- cryzinger a day ago
  
  No story, just a truly random choice lol. I was trying to come up with something completely meaningless just to show how unhelpful those descriptions are, but believe me that I've encountered many real examples that were equally inscrutable :P

nrhrjrjrjtntbt a day ago

I think AI can be a tool to understand a codebase but needs human insight to turn into real docs.

I have used AI to ask specific questions about a codebase and it has been helpful in narrowing the search base. Think of AI as probable cause, not evidence. It speeds up getting to the truth but can't be trusted as the truth.

pfannkuchen a day ago

Probably nobody thinks this is that good of an idea even in the company making it, it’s just that AI is being shoehorned into every possible thing currently and whoever is leading it needed a promotion.

redhale a day ago

> The process takes around ten minutes

And here's one of the primary issues. It's possible to create good docs with an agentic workflow, but it takes time (could be up to several hours depending on the repo size) and a ton of tokens, processing every line of code multiple times with multiple levels of summarization and different lenses of analysis and synthesis. This is very valuable for legacy systems no one understands, for instance.
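The "multiple levels of summarization" part can be sketched as a bottom-up reduction. This is only an illustration under stated assumptions, not how any particular product works: `summarize_repo`, `stub_summarize`, and the `fan_in` parameter are hypothetical names, and the stub stands in for whatever model call a real pipeline would make.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: takes a batch of texts, returns one summary.
Summarizer = Callable[[Sequence[str]], str]


def summarize_repo(chunks: Sequence[str], summarize: Summarizer, fan_in: int = 4) -> str:
    """Summarize each chunk, then repeatedly summarize groups of summaries."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    if not chunks:
        return ""
    # Level 0: one summary per source chunk (e.g. per file or function).
    level: List[str] = [summarize([chunk]) for chunk in chunks]
    # Higher levels: merge up to fan_in summaries at a time until one remains.
    while len(level) > 1:
        level = [summarize(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0]


def stub_summarize(texts: Sequence[str]) -> str:
    """Stand-in for a model call: keeps the first 12 characters of each input."""
    return " + ".join(text[:12] for text in texts)


overview = summarize_repo(["def foo(): ...", "def bar(): ...", "class Baz: ..."], stub_summarize)
```

Every chunk costs at least one call, and each extra level adds more on top, which is where the time and token bill comes from. Running the whole thing again per "lens" multiplies it.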

The problem is that that doesn't scale to a DeepWiki-style product available for literally every repo on GitHub. It's way less flashy of a demo. And this is a demo.

ChrisArchitect a day ago

Google Releases CodeWiki

https://news.ycombinator.com/item?id=45926350

rurban 2 days ago

Previously discussed at https://news.ycombinator.com/item?id=45002092