MySQL 5.5 : Which one of the following is a better storage for a text/varchar field in innodb? -
requirement :
page#1 -> display users , 1-2 line preview of latest 10 blog posts
page#2 -> display single blogpost total text.
method 1 :
mysql table -> userid -> varchar 50 post_id -> integer post_title -> varchar 100 post_description -> varchar 10000
for page#1 , select user_id, post_title , post_description blog_table . , substring of post_description used show preview in listing.
for page#2 , select user_id , post_title, post_description post_id = n
method 2 :
mysql table -> userid -> varchar 50 post_id -> integer post_title -> varchar 100 post_brief -> varchar 250 post_description -> text
for page#1 , select user_id, post_title , post_brief blog_table .
for page#2 , select user_id , post_title, post_description post_id = n
does storing 2 columns, 1 brief varchar , 1 total text ( since accesses file scheme , , should queried when needed ) , worth performance benefit ?
since, method 2, store pointer text in row, whereas method 1 store total varchar 10k string in row. impact amount of table info can reside in ram , hence impact read performance of queries ?
the performance of sql queries depends on joins, clauses, grouping bys , order bys, not on columns retrieved. columns have noticable effect on query's speed if more info retrieved might have go on network processed programming language. not case here.
short answer: difference in performance between 2 proposed setups small.
for speed, post_id
column should have (unique) index. not selecting, sorting or grouping other column, info can come straight table, fast process.
you talking "pages" here, i'm guessing going presented users - seems unlikely want show table of thousands of blog posts on same page human, therefor have order and/or limit clauses in statements didn't include in question.
but lets bit deeper whole thing. lets assume reading tons of text columns straight hard disk, wouldn't nail drive's maximum reading speed? wouldn't retrieving varchar(250) faster, since saves left() call?
we can left() phone call off table real quick. string functions fast - after all, cpu cutting off of data, fast process. times when produce noticable delay when they're used in clauses, joins etc., not because functions slow, because have run lots (possibly millions) of times in order produce single row of results, , more so, because uses prevent database using indexes properly.
so in end comes downwards to: how fast can mysql read table contents database. , in turn depends on storage engine using , settings. mysql can utilize number of storage engines, including (but not limited to) innodb , myisam. both of these engines offer different file layouts big objects such text or blob columns (but funnily enough, varchars). if text column stored in different page rest of row, storage engine has retrieve 2 pages every row. if stored along rest, it'll 1 page. sequential processing major alter in performance.
here's bit of background reading on that:
blob storage in innodb myisam dynamic vs. compressed info file layouts
long answer: depends :)
you have number of benchmark tests on own hardware create phone call layout quicker. given sec setup introduces redundancy additional column, perform worse in scenarios. perform improve if - , if - table construction allows shorter varchar column fit same page on disk while long text column on page.
edit: more on text columns , performance
there seems mutual misconception blobs , in-memory processing. quite number of pages (including answers here on stackoverflow - i'll seek find them, , give additional comment) state text columns (and other blobs) cannot processed in memory mysql, , such performance hog. not true. what's happening this:
if run query involves text column and query needs temporary table processed, then mysql have create temporary table on disk rather in memory, because mysql's memory
storage engine cannot handle text columns. see this related question.
the mysql documentation states (the paragraph same versions 3.2 through 5.6):
instances of blob or text columns in result of query processed using temporary table causes server utilize table on disk rather in memory because memory storage engine not back upwards info types (see section 8.4.3.3, “how mysql uses internal temporary tables”). utilize of disk incurs performance penalty, include blob or text columns in query result if needed. example, avoid using select *, selects columns.
it lastly sentence confuses people - because bad example. simple select *
not affected performance problem because won't utilize temporary table. if same select illustration ordered non-indexed column, would have utilize temporary table , would affected problem. utilize explain
command in mysql find out whether query need temporary table or not.
by way: none of affects caching. text columns can cached else. if query needed temporary table , had stored on disk, result still cached if scheme had resources so, , cache not invalidated. in regard, text column else.
edit 2: more on text columns , memory requirements ...
mysql uses storage engine retrieve records disk. buffer results , hand them sequentially client. next assumes buffer ends in memory , not on disk (see above why)
for text columns (and other blobs), mysql buffer pointer actual blob. such pointer uses few bytes of memory, requires actual text content retrieved disk when row handed client. varchar columns (and else blobs), mysql buffer actual data. utilize more memory, because of texts going more few bytes. calculated columns, mysql buffer actual data, varchars.
a couple of notes on this: technically, blobs buffered when handed on client, 1 @ time - , big blobs perchance not in entirety. since buffer gets freed after each row, not have major effect. also, if blob stored in same page rest of row, may end beingness treated varchars. honest, i've never had requirement homecoming lots of blobs in single query, never tried.
now lets reply (now edited) question:
page #1. overview of users , short blog post snippets.
your options pretty much these queries
select userid, post_title, left(post_description, 250) `table_method_1` <-- calculated based on varchar column select userid, post_title, left(post_description, 250) `table_method_2` <-- calculated based on text column select userid, post_title, post_brief `table_method_2` <-- precalculated varchar column select userid, post_title, post_description `table_method_2` <-- homecoming total text, allow client produce snippet
the memory requirements of first 3 identical. 4th query require less memory (the text column buffered pointer,) more traffic client. since traffic on network (expensive in terms of performance,) tends slower other queries - mileage may vary. left() function on text column might sped telling storage engine utilize inlined table layout, depend on average length of text beingness stored.
page #2. single blog post
select userid, post_title, post_description `table_method_1` post_id=... <-- returns varchar select userid, post_title, post_description `table_method_2` post_id=... <-- returns text
the memory requirements low begin with, since 1 single row buffered. reasons stated above sec require tiny bit less memory buffer row, additional memory buffer single blob.
in either case, i'm pretty sure you're not concerned memory requirements of select that'll homecoming single row, not matter.
summary
if have text of arbitrary length (or requires more few kilobytes), should utilize text columns. that's they're there for. way mysql handles columns beneficial most of time.
there 2 things remember everyday use:
avoid selecting text columns, blob columns , other columns may have lots of info (and yes, includes varchar(10000)) if don't need them. habit of "select * whatever" when need couple of values set lot of unnecessary stress on database. when are selecting text columns or other blobs, create sure select not utilize temporary table. utilize
explain
syntax when in doubt.
when stick rules, should decent performance mysql. if need farther optimation that, you'll have @ finer details. include storage engines , respective table layouts, statistical info on actual data, , knowledge hardware involved. experience, rid of performance hogs without having dig deep.
mysql innodb