[ltp] Re: solid state drive?

Laurent linux-thinkpad@linux-thinkpad.org
Wed, 18 Feb 2009 22:19:39 +0100


Hi,

> The ext3 journaling combines multiple filesystem operations into a
> single commit.  By default, commits take place every 5 seconds (or
> every 10 minutes if laptop mode is enabled); commits can happen sooner
> if there is memory pressure, or if the application explicitly calls
> fsync().  This tends to avoid small writes for the journal in general,
> unless there are problematic applications that are calling fsync() all
> of the time.

Hmmm... I should check our software for unneeded syncs. Thanks for
the info. I always assumed the fs driver was writing each log entry
with a separate call to the block device.
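Just to illustrate the point about unneeded syncs: the difference is whether you fsync() per record or once per batch. A minimal sketch (hypothetical file and function names, not from our codebase):

```python
import os

def write_records_per_sync(path, records):
    """Worst case for an SSD: one fsync per record forces a
    separate journal commit and flash write for every entry."""
    with open(path, "ab") as f:
        for rec in records:
            f.write(rec)
            f.flush()
            os.fsync(f.fileno())  # one flush per record

def write_records_batched(path, records):
    """Let the journal batch the records into one commit; a
    single fsync at the end is enough when per-record
    durability is not required."""
    with open(path, "ab") as f:
        for rec in records:
            f.write(rec)
        f.flush()
        os.fsync(f.fileno())  # single flush for the whole batch
```

Same bytes on disk either way; only the number of forced commits differs.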

>> The system fails if you have no more free blocks to play
>> with. Things go south once all your blocks are 70% valid
>> and contain old, "overwritten" data. At that point you
>> need an SSD-internal defrag to get some fresh free blocks.
>> And the intels fail to do that.
>
> Where did you get the 70% figure from?

Tests with the Intel. It starts to slow down once you've written ~170%
of its capacity in small DB commits.

> When you say intensive use, was this with the DB commit-logs or e-mail
> use case, or both?

First the one, then the other. Both degraded badly.

> (Of course, using ext2 on an e-mail server where there's no guarantee
> the filesystem will be consistent or that there won't be data loss
> after a system crash has its own problems.  I suspect the right answer
> for e-mail servers is to stick with HDD's, and if you really want to
> throw money at the problem and have scalability problems, use multiple
> servers and MX records for the front-end hub, and multiple PO/IMAP
> servers for the back-end.  I used to be on MIT's network operations
> group, so I know something about engineering very large scale mail
> server infrastructure.  We played with battery-backed DRAM's for the
> spool directory, but *man* that stuff was expensive....)

(A single server with a decent RAID can handle our mail fine. But the
RAID controller failed, and looking for alternatives turned up 3 SSDs.
And for the DB ... sometimes you'll do anything for a cheap
speed boost. It was one of those let's-try-it moments.)

> I suppose you're right that they're
> useful for read-only or read-mostly DB's, but that's not terribly
> interesting, is it?  :-)

Think about YouTube. You have several TB of data doing mostly nothing
(stored and streamed from RAID/HDD, no problem) and ~500-2000 GB of
interesting data that is streamed all the time. It's hard to fill
Gigabit Ethernet links from a RAID that is reading ~25,000 files at the
same time. Possible, but hard and expensive. RAM is another possible,
but even more expensive, solution.

It is no problem with cheap SSDs. Just don't write while streaming.
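The rough numbers behind that claim, with a toy seek-plus-transfer model (all parameters are illustrative assumptions, not measurements):

```python
def random_read_throughput_mb_s(chunk_kb=64, seek_ms=8.0,
                                sustained_mb_s=80.0):
    """Toy model: with ~25,000 concurrent readers the disk sees
    an essentially random access pattern, so each chunk costs a
    full seek plus its transfer time."""
    transfer_s = (chunk_kb / 1024.0) / sustained_mb_s
    return (chunk_kb / 1024.0) / (seek_ms / 1000.0 + transfer_s)

# One seek-bound HDD: ~7 MB/s of random 64 KB reads -- far below
# the ~125 MB/s a GbE link carries.  A flash device with ~0.1 ms
# access time stays close to its sequential rate instead.
```

That gap is why you need a big (expensive) RAID to keep the link full from HDDs, while a couple of cheap SSDs do it trivially.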

We have the same problem, but with smaller datasets (about 5 KB each,
about 800 GB total). The software knows which datasets to get
(~25,000 in one go, but at random positions). But we also modify and
write back parts of the data. And here the SSD failed, even with a
non-ACID DB (no/little commit log, just 2 KB writes at random
positions). Now people are running tests with a read-only DB and a
second one for the changes. No results so far.
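The split being tested could look roughly like this (a hypothetical in-memory sketch of the idea, not the actual setup, which uses two real DB instances):

```python
class OverlayStore:
    """Sketch of the read-only-base-plus-delta idea: reads check a
    small mutable delta first, then the read-only bulk data on the
    SSD.  The delta is merged back in one big sequential pass, so
    the SSD never sees small random writes."""

    def __init__(self, base):
        self.base = base    # read-only dataset (e.g. on the SSD)
        self.delta = {}     # recent modifications (RAM or HDD)

    def get(self, key):
        return self.delta.get(key, self.base.get(key))

    def put(self, key, value):
        self.delta[key] = value  # never touches the base

    def merge(self):
        """One sequential rewrite instead of many 2 KB writes."""
        merged = dict(self.base)
        merged.update(self.delta)
        self.base, self.delta = merged, {}
        return self.base
```

Whether the merge cost stays acceptable for 800 GB is exactly what the tests should show.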

cu