MySQL Database Design - Specific Case, Columns or additional table?

Go To StackoverFlow.com

1

I am really only a hobbyist, who has aspirations far too grand, that said, I am trying to figure out the right way to create my database so that database changes don't require client refactoring, but also fast. Please respond in as if I wouldn't understand typical development or DBA terminology well.

The situation: I was trying to determine how many books each user has rated. I consider a book rated if it has any two of the following:

-Overall rating (ratings table)
-sub rating (ratings table)
-tag (book_tags table)
-list (book_lists table)

*Related tables: users, tags, lists

The problem: I have 10 sub ratings and the two overall ratings all in the ratings table, each in a column (guessing this is bad, but not sure). Should I instead have a ratings table (12 rows) and a book_ratings table where each row of the ratings table is a type of rating for a user?

-e.g. book_ratings: id  |  user_id  |  book_id  |  rating_id  

If yes, what happens if there are 500k books, 12 rating types per book, 10,000 users and a total of 5 billion rows on that book_ratings table? is that going to run super slow? Another consideration is that I may want to add more sub rating types in the future, which is partially why I think it might be valuable to change it, but it's a lot of work, so I wanted to check first.

Thanks!

2012-04-05 22:25
by Jed Grant


1

You should model your system to make it usable and extendable. Having 12 rating columns is going to cause you a lot of pain when you want to aggregate results etc. There are plenty of examples of those kinds of pains on this site.

As it grows you optimize by adding indexes, clustering, data partitioning, etc.

But if you know you are going to have massive amounts of data right away you might want to consider some "Big Data" solutions and perhaps go the NoSQL way..

2012-04-05 22:36
by barsju
Any chance you can point me at some information about optimizing? Something suitably basic? The other day it was REALLY slowing down until I added indexes to all the fields I use in various queries, but even that was done like a blind man wandering in the dark. Ideally, over the long term, I would love to have 200,000 users and 1 million books as part of the database if it goes well (Pipe dream, but hey :). I would like to know how to plan ahead if possible - Jed Grant 2012-04-06 01:56
Optimizing this things is a bit of a black art. I am not familiar with MySQL, but in SQL Server you can display the 'execution plan' of a query - i.e. a details of what the server is doing to execute the query, this is very useful for optimization, in the case of SQL Server it even direcly suggest which indexes should be added. I assume that there is something similar for MySQL - MiMo 2012-04-06 14:12


1

Yes, I would change the structure as you describe - it is more flexible and more 'correct' (normalized).

You'll have 5 billion rows (that would indeed be bad) only if EACH user gives ALL the ratings to ALL the books, that seems unlikely. A great majority of users would not rate anything, and a great majority of books will not attract any rating.

2012-04-05 22:35
by MiMo
Ads