Sunday, 15 July 2012

python - How to apply linregress in Pandas groupby



I would like to apply scipy.stats.linregress within a pandas groupby. I have looked through the documentation and can see how to apply a function to a single column, such as

grouped.agg(np.sum)

or a function like

grouped.agg({'d': lambda x: np.std(x, ddof=1)})

But how do I apply linregress, which takes two inputs, x and y?

The linregress function, like many other scipy/numpy functions, accepts "array-like" x and y; both Series and DataFrame qualify.

For example:

import numpy as np
import pandas as pd
from scipy.stats import linregress

x = pd.Series(np.arange(10))
y = pd.Series(np.arange(10))

In [4]: linregress(x, y)
Out[4]: (1.0, 0.0, 1.0, 4.3749999999999517e-80, 0.0)

In fact, being able to use scipy (and numpy) functions is one of pandas' killer features!

So if you have a DataFrame, you can use linregress on its columns (which are Series):

linregress(df['col_x'], df['col_y'])
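
As a minimal sketch of the call above, assuming a made-up DataFrame in which col_y is exactly 2 * col_x + 1 (the data here is purely illustrative), the fitted values can be read from the result by name rather than by tuple position:

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

# Hypothetical data: col_y is an exact linear function of col_x.
df = pd.DataFrame({'col_x': np.arange(10.0)})
df['col_y'] = 2.0 * df['col_x'] + 1.0

# Pass the two columns (Series) straight to linregress.
result = linregress(df['col_x'], df['col_y'])

# The result exposes its fields as named attributes.
slope, intercept = result.slope, result.intercept
print(slope, intercept)
```

The named attributes (slope, intercept, rvalue, pvalue, stderr) are available in reasonably recent scipy releases; older versions return only a plain tuple in that order.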

And if you are using a groupby, you can similarly apply it (to each group):

grouped.apply(lambda x: linregress(x['col_x'], x['col_y']))
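
The lambda above yields one linregress result per group. A possible variation, sketched here under assumptions (the key column, the sample data, and the fit_group helper are all hypothetical), returns a Series from each group so that the combined result is a DataFrame with one row per group:

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

# Hypothetical data: two groups, each with its own exact linear relation.
x = np.arange(5.0)
df = pd.DataFrame({
    'key': ['a'] * 5 + ['b'] * 5,
    'col_x': np.concatenate([x, x]),
    'col_y': np.concatenate([2.0 * x + 1.0, -1.0 * x + 3.0]),
})

def fit_group(group):
    # Hypothetical helper: fit one group and label the fields of interest.
    res = linregress(group['col_x'], group['col_y'])
    return pd.Series({'slope': res.slope,
                      'intercept': res.intercept,
                      'r_value': res.rvalue})

# Select only the data columns so the grouping key is not passed to fit_group.
fits = df.groupby('key')[['col_x', 'col_y']].apply(fit_group)
print(fits)
```

Returning a labelled Series is a design choice rather than a requirement: it keeps the per-group slope and intercept in ordinary columns, which is convenient for later filtering or merging.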

python pandas
