python - How to apply linregress in Pandas groupby
I want to apply scipy.stats.linregress within a pandas groupby. I have looked through the documentation and can see how to apply something to a single column, like
grouped.agg(np.sum)
or a function, like
grouped.agg({'d': lambda x: np.std(x, ddof=1)})
But how do I apply linregress, which has two inputs, x and y?
The linregress function, like many other scipy/numpy functions, accepts "array-like" x and y; both Series and DataFrame qualify.
For example:
import numpy as np
import pandas as pd
from scipy.stats import linregress

x = pd.Series(np.arange(10))
y = pd.Series(np.arange(10))

In [4]: linregress(x, y)
Out[4]: (1.0, 0.0, 1.0, 4.3749999999999517e-80, 0.0)
In fact, being able to use scipy (and numpy) functions is one of pandas' killer features!
So if you have a DataFrame, you can use linregress on its columns (which are Series):
linregress(df['col_x'], df['col_y'])
And if you are using a groupby, you can similarly apply it (to each group):
grouped.apply(lambda x: linregress(x['col_x'], x['col_y']))
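To make the per-group call concrete, here is a minimal sketch on made-up data. The column names group, col_x and col_y, the sample values, and the helper fit_group are assumptions for illustration only, not part of the original question. It returns the regression result of each group as a Series, so that the groupby produces one row of named statistics per group.

```python
import numpy as np
import pandas as pd
from scipy.stats import linregress

# Hypothetical data: two groups, each following an exact line.
df = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "col_x": np.tile(np.arange(5, dtype=float), 2),
})
# Group "a" is built as y = 2x + 1, group "b" as y = -x + 3.
df["col_y"] = np.where(df["group"] == "a",
                       2 * df["col_x"] + 1,
                       -df["col_x"] + 3)

def fit_group(g):
    """Run linregress on one group's columns and label the results."""
    res = linregress(g["col_x"], g["col_y"])
    return pd.Series({
        "slope": res.slope,
        "intercept": res.intercept,
        "rvalue": res.rvalue,
        "pvalue": res.pvalue,
        "stderr": res.stderr,
    })

# Select the two data columns explicitly, then fit each group.
fits = df.groupby("group")[["col_x", "col_y"]].apply(fit_group)
print(fits)
```

Returning a Series from the applied function is one way to get a DataFrame indexed by group rather than a Series of result tuples; the lambda form above works too if the raw tuples are enough.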
python pandas