How to split a pandas DataFrame into specifically sized chunks with random contents in Python

Splitting a pandas DataFrame into chunks of predetermined size with random contents means dividing the DataFrame proportionally by size, with each chunk holding a random selection of rows from the original DataFrame. No row appears in more than one chunk. For example, splitting a DataFrame of five rows into 60% and 40% chunks results in one DataFrame of three rows and another of two rows.

Solution: Use pandas.DataFrame.sample() and pandas.DataFrame.drop() to split a pandas DataFrame into specifically sized chunks with random contents. Call pandas.DataFrame.sample(frac=split_percent, random_state=seed) with split_percent as the decimal representation of the percentage to take (for example, 0.6 for 60%); this creates a new DataFrame of the specified size made of randomly chosen rows. random_state sets the random seed; it can be omitted in most cases and is only needed when the split must be reproducible. Then call pandas.DataFrame.drop(first_percentage.index) on the original DataFrame, with first_percentage as the DataFrame returned by pandas.DataFrame.sample(), to drop those rows and be left with the remaining rows. drop() returns a new DataFrame and leaves the original unchanged.
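The steps above can be sketched as follows for the five-row, 60%/40% example. The column name "value", the seed value 0, and the variable names df, seed, and remaining are assumptions made for illustration. The sketch also assumes the DataFrame's index labels are unique, since drop() removes every row whose label matches.

```python
import pandas as pd

# Hypothetical five-row DataFrame; the column name and values are illustrative.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

split_percent = 0.6  # take 60% of the rows for the first chunk
seed = 0             # arbitrary seed so the split is reproducible; may be omitted

# Randomly select 60% of the rows without replacement.
first_percentage = df.sample(frac=split_percent, random_state=seed)

# Drop the sampled rows by index label to get the remaining 40%.
# This relies on the index labels being unique.
remaining = df.drop(first_percentage.index)

print(len(first_percentage), len(remaining))  # 3 2
```

Which rows land in each chunk depends on the seed, but the chunk sizes do not. The same pattern can be repeated on the remaining rows to carve out further chunks, keeping in mind that frac then applies to the rows that are left rather than to the original DataFrame.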

