eBay architecture
2006-12-05 15:53:34.765927+00 by
Dan Lyke
6 comments
Some interesting notes on scalable system architecture in Johannes Ernst's notes on a talk by Dan Pritchett and Randy Shoup about the eBay Architecture, and Dan Pritchett's slides on the talk in PDF. Among other things, it talks about moving CPU-intensive work (joins, sorting and referential integrity) out of the database layer and into the application layer, because application layer stuff is more easily distributed across processors.
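To make the "joins, sorting and referential integrity in the application layer" idea concrete, here is a minimal Python sketch. It is not taken from the eBay talk: the table shapes, field names and the function `join_items_to_sellers` are all made up for illustration. It assumes the rows came from two simple single-table queries, and it does the join, the orphan check and the sort in application code.

```python
from operator import itemgetter

# Hypothetical rows, as if fetched by two simple single-table queries.
users = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
]
items = [
    {"id": 10, "seller_id": 2, "price": 5.0},
    {"id": 11, "seller_id": 1, "price": 12.5},
    {"id": 12, "seller_id": 3, "price": 1.0},  # seller 3 does not exist
]


def join_items_to_sellers(items, users):
    """Join items to their sellers in application code (a simple hash join).

    Returns (joined, orphans): joined rows sorted by price, highest first,
    and the items whose seller_id matches no user. The orphan list is the
    application-side stand-in for a foreign key constraint.
    """
    users_by_id = {u["id"]: u for u in users}  # build side of the hash join
    joined, orphans = [], []
    for item in items:  # probe side
        seller = users_by_id.get(item["seller_id"])
        if seller is None:
            orphans.append(item)  # referential integrity violation
        else:
            joined.append(
                {"item_id": item["id"], "seller": seller["name"], "price": item["price"]}
            )
    # Sorting also happens here rather than in an ORDER BY.
    joined.sort(key=itemgetter("price"), reverse=True)
    return joined, orphans


joined, orphans = join_items_to_sellers(items, users)
```

The point of the exercise is that this function can run on any number of cheap application servers, while the database only answers two trivial queries.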
As I consider going to a hosted solution for my web sites, and my development process in general, this approach makes me re-think my preference for PostgreSQL (a real database) versus MySQL (still largely a bunch of flat files with an SQL layer on top).
And Dan Pritchett's weblog looks like a must read if you're building large scalable database driven systems.
[ related topics:
Software Engineering Databases
]
comments in ascending chronological order:
#Comment Re: made: 2006-12-05 16:09:33.740028+00 by:
Dan Lyke
This also makes me think that maybe some of the work towards replication is the wrong way to go when trying to optimize database architectures, that splitting the database system in half so that the CPU intensive portions of queries can be pulled to separate systems (not just separate CPUs) could leave you with a real database that still distributes more cleanly.
Of course with everyone going to fancy object wrappers for their SQL data, there's probably no point to that.
#Comment Re: made: 2006-12-05 17:33:44.600684+00 by:
meuon
No, not everyone is going to 'fancy object wrappers'. But I do agree that putting -some- if not most of the heavy lifting outside of the database itself makes some sense. I do it so I can break the rules (very carefully), and because in a changing live application environment, I can change code, with if statements for when to apply which rules, and when to fudge on them.
#Comment Re: made: 2006-12-05 19:14:26.614009+00 by:
Mark A. Hershberger
Then there's this:
http://spyced.blogspot.com/200...k-postgresql-beats-stuffing.html
"PostgreSQL beats the stuffing out of MySQL" -- especially in multi-core hardware.
So, if all your data is already in PG, I'd think about keeping it there and, maybe, just dropping all the fun logic in your database.
#Comment Re: made: 2006-12-05 20:08:31.858428+00 by:
Dan Lyke
I haven't looked at the internals of PostgreSQL since sometime back in the migration between versions 6 and 7, but it sounds like maybe they've implemented that layer of separation...
And I just put permissions on the wiki pages and re-enabled them, and I'm pretty sure that the query I used to update the permissions on Flutterby users that I deemed worthy of trust would have tied MySQL in knots, given the number of radical subselects and counts involved in it.
#Comment Re: made: 2006-12-05 23:22:54.891965+00 by:
meuon
"given the number of radical subselects and counts involved" - sounds like funky organic database schemas that need pruning and grafting, if not replanting in new pots.
#Comment Re: made: 2006-12-05 23:32:04.273835+00 by:
Dan Lyke
Kinda, the one quirk is that permissions are a link between user and weblog, with the ability to have multiple weblogs, and that articles are an object that can be a weblog entry, a weblog comment, a wiki comment, a....
But mostly it was the heuristics I wanted to apply to allow the permission, which revolved around some number of legitimate posts and things like that. So there were a few WHERE (SELECT COUNT(*) FROM articles WHERE articles.author_id=users.id AND [some trait about articles]) > 5 AND... type constructions.
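For anyone who wants to play with that kind of correlated-subquery heuristic, here is a small self-contained sketch using Python's sqlite3 module with an in-memory database. It is not the actual Flutterby query or schema: the `is_spam` column is a hypothetical stand-in for the "[some trait about articles]" part, and the table layout and the function `trusted_user_ids` are assumptions made up for the example.

```python
import sqlite3


def trusted_user_ids(conn, min_posts=5):
    """Return ids of users with more than min_posts qualifying articles.

    Uses a correlated COUNT(*) subselect in the WHERE clause. The
    is_spam = 0 test is a hypothetical placeholder for whatever trait
    marks an article as legitimate.
    """
    rows = conn.execute(
        """
        SELECT users.id FROM users
        WHERE (SELECT COUNT(*) FROM articles
               WHERE articles.author_id = users.id
                 AND articles.is_spam = 0) > ?
        ORDER BY users.id
        """,
        (min_posts,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical schema and data, just enough to exercise the query.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        author_id INTEGER,
        is_spam INTEGER
    );
    """
)
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "regular"), (2, "newcomer"), (3, "spammer")],
)
# Six legitimate posts for user 1, two for user 2, eight spam posts for user 3.
articles = [(1, 0)] * 6 + [(2, 0)] * 2 + [(3, 1)] * 8
conn.executemany(
    "INSERT INTO articles (author_id, is_spam) VALUES (?, ?)", articles
)

trusted = trusted_user_ids(conn)
```

With several such conditions ANDed together, the database has to run one subselect per user per condition, which is where the choice of query planner starts to matter.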