The following function keeps the rows of a DataFrame in which at least half of the values are not NA:
```python
import math

import numpy as np
import pandas as pd


def select_rows(df):
    # Count non-NA values in each row
    count_non_na = df.notna().sum(axis=1)
    # Threshold: at least half of the columns, rounded up
    threshold = math.ceil(df.shape[1] / 2)
    # Keep only the rows that meet the threshold
    new_df = df[count_non_na >= threshold]
    return new_df


# Sample input
df = pd.DataFrame([[1, np.nan, 2, 4],
                   [2, 3, 5, 9],
                   [np.nan, 4, 6, np.nan],
                   [None, 4, 6, 7],
                   [None, np.nan, 5, None]])

# Test the function
ans = select_rows(df)
print(ans)
```
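Output:
```
     0    1  2    3
0  1.0  NaN  2  4.0
1  2.0  3.0  5  9.0
2  NaN  4.0  6  NaN
3  NaN  4.0  6  7.0
```
With four columns the threshold is 2, so rows 0 to 3 are kept and only row 4, which has a single non-NA value, is dropped.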
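The same filter can probably be written more compactly with the `thresh` parameter of pandas' `DataFrame.dropna`, which keeps rows that have at least that many non-NA values. The sketch below is an alternative, not part of the original answer; the helper name `select_rows_dropna` is made up for illustration, and the threshold is assumed to be half the column count rounded up, as above.

```python
import math

import numpy as np
import pandas as pd


def select_rows_dropna(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical helper: keep rows with at least ceil(n_cols / 2) non-NA values.
    threshold = math.ceil(df.shape[1] / 2)
    # dropna(thresh=n) retains rows that contain at least n non-NA values.
    return df.dropna(thresh=threshold)


df = pd.DataFrame([[1, np.nan, 2, 4],
                   [2, 3, 5, 9],
                   [np.nan, 4, 6, np.nan],
                   [None, 4, 6, 7],
                   [None, np.nan, 5, None]])

result = select_rows_dropna(df)
print(result.index.tolist())  # [0, 1, 2, 3]
```

Both versions return a filtered copy and leave the input DataFrame unchanged. The explicit boolean mask is easier to adapt if the condition later becomes something other than a simple count.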
Content provided by the 零声教学 AI assistant; the question came from a student.