sequential file stage property

Posted: Tue Nov 24, 2015 2:35 am
by wuruima
Read from multiple nodes
Number Of readers per node

Some days ago I learned that we can set "Number Of readers per node" to improve performance when reading a cma file. That means several readers read the file at the same time on one node.

Then I noticed "Read from multiple nodes". According to the documentation, I think it works like a parallel run: 4 nodes read the same file at the same time.

My question: in my view, a sequential file should be read in sequence. If we set either of these properties, the file will be processed in parallel. Is this understanding correct?

And should we always set one of these properties when reading a sequential file, so that we get better performance?

Thanks..
Walter/

Posted: Tue Nov 24, 2015 2:55 pm
by qt_ky
I believe those settings only apply to fixed-width files, not delimited files. The benefits of using the settings may be questionable unless justified. It really depends on your topology, horsepower available, and the size of the fixed-width files. I would say not to always set those settings, but only do so when you have the resources and can measure an overall performance improvement.
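
To picture why fixed-length records suit several readers, here is a minimal Python sketch. It is not how DataStage is implemented, only an illustration; the function name `fixed_length_ranges` and its parameters are hypothetical. When every record has the same byte length, each reader's start and end offsets can be computed by arithmetic alone, without looking at the data.

```python
def fixed_length_ranges(file_size, record_length, readers):
    """Return one (start, end) byte range per reader, aligned on record boundaries.

    Illustrative only: assumes every record is exactly record_length bytes
    (including any record terminator). Trailing bytes that do not form a
    whole record are not assigned to any reader.
    """
    if record_length <= 0 or readers <= 0:
        raise ValueError("record_length and readers must be positive")

    total_records = file_size // record_length
    # Spread records as evenly as possible; the first `extra` readers get one more.
    base, extra = divmod(total_records, readers)

    ranges = []
    start = 0
    for i in range(readers):
        count = base + (1 if i < extra else 0)
        end = start + count * record_length
        ranges.append((start, end))
        start = end
    return ranges
```

Each reader can then seek straight to its own `start` offset and read up to `end`, independently of the others. Whether that is actually faster still depends on the disk, the node count, and the file size, so it has to be measured.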

Posted: Tue Nov 24, 2015 3:30 pm
by chulett
qt_ky wrote:I believe those settings only apply to fixed-width files, not delimited files.
Correct!

Posted: Tue Nov 24, 2015 4:17 pm
by ray.wurlod
Actually these properties can be used with delimited files, though it's not as efficient as with fixed-width format files.
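
A rough Python sketch of why a delimited file costs more to split across readers. This is an assumption about the general technique, not a description of DataStage internals, and `delimited_ranges` is a hypothetical name. Because rows vary in length, an evenly spaced offset usually lands in the middle of a row, so each split point has to scan forward to the next row delimiter.

```python
import io


def delimited_ranges(stream, size, readers, delimiter=b"\n"):
    """Return one (start, end) byte range per reader for a delimited file.

    Illustrative only: assumes a seekable binary stream, a single-byte
    record delimiter, and no delimiter bytes embedded inside quoted fields.
    """
    if readers <= 0:
        raise ValueError("readers must be positive")

    starts = [0]
    for i in range(1, readers):
        guess = size * i // readers  # evenly spaced offset, probably mid-row
        stream.seek(guess)
        pos = guess
        while True:
            chunk = stream.read(4096)
            if not chunk:
                pos = size  # no delimiter left: this reader gets an empty range
                break
            idx = chunk.find(delimiter)
            if idx != -1:
                pos += idx + 1  # the next row starts just after the delimiter
                break
            pos += len(chunk)
        # Never move backwards if two guesses fall inside the same long row.
        starts.append(max(pos, starts[-1]))

    bounds = starts + [size]
    return [(bounds[i], bounds[i + 1]) for i in range(readers)]
```

For example, `delimited_ranges(io.BytesIO(data), len(data), 4)` returns four contiguous ranges that each begin at the start of a row. The extra scanning at every split point is the part that fixed-length records avoid.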

Posted: Wed Nov 25, 2015 7:48 pm
by wuruima
Haha, I actually didn't notice this point when I read the doc.

And yes, I used it on a delimited file and the performance is better.

Posted: Wed Nov 25, 2015 7:54 pm
by wuruima
I think fixed-length rows != fixed-width format file.