As you may have heard, Stack Overflow (SO) and OpenAI recently announced a partnership, which essentially means letting OpenAI train models on SO data and then sell those models. Kathy Reid summarised the problem perfectly:
Like many other technologists, I gave my time and expertise for free to #StackOverflow because the content was licensed CC-BY-SA - meaning that it was a public good. It brought me joy to help people figure out why their #ASR code wasn't working, or assist with a #CUDA bug.
Now that a deal has been struck with #OpenAI to scrape all the questions and answers in Stack Overflow, to train #GenerativeAI models, like #LLMs, without attribution to authors (as required under the CC-BY-SA license under which Stack Overflow content is licensed), to be sold back to us (the SA clause requires derivative works to be shared under the same license), I have issued a Data Deletion request to Stack Overflow to disassociate my username from my Stack Overflow content, and am closing my account, just like I did with Reddit, Inc.
https://policies.stackoverflow.co/data-request/
The data I helped create is going to be bundled in an #LLM and sold back to me.
In a single move, Stack Overflow has alienated its community - which is also its main source of competitive advantage, in exchange for token lucre.
Stack Exchange, Stack Overflow's former instantiation, used to fulfill a psychological contract - help others out when you can, for the expectation that others may in turn assist you in the future. Now it's not an exchange, it's #enshittification.
Programmers now join artists and copywriters, whose works have been snaffled up to create #GenAI solutions.
The silver lining I see is that once OpenAI creates LLMs that generate code - like Microsoft has done with Copilot on GitHub - where will they go to get help with the bugs that the generative AI models introduce, particularly, given the recent GitClear report, of the "downward pressure on code quality" caused by these tools?
While this is just one more example of #enshittification, it's also a salient lesson for #DevRel folks - if your community is your source of advantage, don't upset them.
Like many people who code, I relied on SO a lot when trying to debug errors. But more and more people are leaving the platform, with some trying to delete their answers and getting suspended for it!
Stack Overflow announced that they are partnering with OpenAI, so I tried to delete my highest-rated answers.
Stack Overflow does not let you delete questions that have accepted answers and many upvotes because it would remove knowledge from the community.
So instead I changed my highest-rated answers to a protest message.
Within an hour mods had changed the questions back and suspended my account for 7 days.
We are seeing another example of enshittification, a term coined by Cory Doctorow, who describes it as follows:
Here is how platforms die: first, they are good to their users; then they abuse their users to make things better for their business customers; finally, they abuse those business customers to claw back all the value for themselves. Then, they die. I call this enshittification, and it is a seemingly inevitable consequence arising from the combination of the ease of changing how a platform allocates value, combined with the nature of a "two-sided market", where a platform sits between buyers and sellers, holding each hostage to the other, raking off an ever-larger share of the value that passes between them.
As long as money is involved, a platform will eventually sell out, as Baldur Bjarnason points out:
One of the things that the Stack Overflow brouhaha demonstrates is that it doesn't matter if a service was founded by people trusted by the community (Atwood and Spolsky) and was broadly community-led. If it's a VC-funded startup, they will sell out their users at some point.
This brings me to the question of this post: how and where should we get information? If every platform eventually sells out its users, the very people who create its value, what should we use instead? As Kathy Reid points out, this reliance on models is going to create a huge AI debt (akin to technical debt): where will the people who rely on these tools go to find solutions for the issues that AI-generated code introduces?
Maybe now is a good time to go back and do things the old-fashioned way: reading manuals and books. I feel like there is going to be a huge demand for human expertise in the years to come. Maybe RSS and blogs will make a comeback too...
This work is licensed under a Creative Commons Attribution 4.0 International License.
If you figure out where to get information, let me know; I am unspeakably depressed about this. StackOverflow has been so useful to me, and a future without it is painful to contemplate. I'm not sure exactly when it was that we began to enter a future I want no part of, but we're way inside now.
You and me both! I'll keep watching this space and will write a follow-up blog post in the future, if it still exists!