I am a new user of the bigframes package from googleapis, and I am trying to manipulate a DataFrame loaded from BigQuery.
I am running into a problem that I am not able to solve.
I am trying to use the apply function on a DataFrame with the parameter axis=1, but it doesn't seem to work: I always get an error message.
Can you please help me with this?
Thanks.
Code example
# example
def condition(row):
    print(row)
    if 1 <= row["month"] <= 6:
        return f"{row['year']:02}S1{row['CODPY']}{row['CODDE']}"
    else:
        return f"{row['year']:02}S2{row['CODPY']}{row['CODDE']}"

valodetail_df['IDT'] = valodetail_df.apply(condition, axis=1)
Stack trace
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/core/log_adapter.py", line 44, in wrapper
return method(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/dataframe.py", line 3118, in apply
results = {name: func(col, *args, **kwargs) for name, col in self.items()}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/dataframe.py", line 3118, in <dictcomp>
results = {name: func(col, *args, **kwargs) for name, col in self.items()}
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<stdin>", line 3, in condition
File "missing.pyx", line 419, in pandas._libs.missing.NAType.__bool__
TypeError: boolean value of NA is ambiguous
>>> valodetail_df['IDTDCI'] = valodetail_df.apply(condition,axis=1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/core/log_adapter.py", line 44, in wrapper
return method(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/dataframe.py", line 3118, in apply
results = {name: func(col, *args, **kwargs) for name, col in self.items()}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/miniconda3/envs/qback/lib/python3.11/site-packages/bigframes/dataframe.py", line 3118, in <dictcomp>
results = {name: func(col, *args, **kwargs) for name, col in self.items()}
^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: condition() got an unexpected keyword argument 'axis'
2 Answers
axis=1 is currently not supported: https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.dataframe.DataFrame#bigframes_dataframe_DataFrame_apply
There is a feature request for it: https://github.com/googleapis/python-bigquery-dataframes/issues/592
However for your particular use case it is possible to achieve by other means.
Here is a guess of the kind of DataFrame you are working with:
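A minimal sketch of such a frame, assuming integer `year` and `month` columns and string `CODPY` and `CODDE` columns; the values are invented for illustration. It is built with plain pandas so that it runs without a BigQuery project. With BigQuery DataFrames, the import would be `bigframes.pandas` instead.

```python
import pandas as pd  # stand-in; with BigQuery DataFrames: import bigframes.pandas as bpd

# Hypothetical sample data: column names come from the question,
# the values and dtypes are assumptions made for illustration.
valodetail_df = pd.DataFrame(
    {
        "year": [2023, 2023, 2024],
        "month": [3, 7, 12],
        "CODPY": ["FR", "DE", "US"],
        "CODDE": ["A1", "B2", "C3"],
    }
)

print(valodetail_df)
```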
We can use other DataFrame and Series APIs to create the desired column:
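One way this could look, as a sketch rather than the exact code from this answer: build the identifier from whole columns instead of row by row. The helper name `add_idt` is hypothetical, and the example assumes four-digit years, so the `:02` padding from the question has no effect and is omitted. It is written against plain pandas so it runs locally; the same column operations are expected to carry over to a BigQuery DataFrames frame, but that is an assumption to check against the bigframes documentation.

```python
import pandas as pd  # stand-in; with BigQuery DataFrames: import bigframes.pandas as bpd


def add_idt(frame):
    """Return a copy of `frame` with an IDT column built without apply(axis=1)."""
    # True for months 1..6, False otherwise (same test as in the question).
    first_half = frame["month"].between(1, 6)
    # True -> 1 -> "S1", False -> 0 -> "S2".
    semester = "S" + (2 - first_half.astype("Int64")).astype("string")
    idt = (
        frame["year"].astype("string")
        + semester
        + frame["CODPY"].astype("string")
        + frame["CODDE"].astype("string")
    )
    return frame.assign(IDT=idt)


# Hypothetical sample data for a quick local check.
sample = pd.DataFrame(
    {
        "year": [2023, 2023, 2024],
        "month": [3, 7, 12],
        "CODPY": ["FR", "DE", "US"],
        "CODDE": ["A1", "B2", "C3"],
    }
)
result = add_idt(sample)
print(result["IDT"].tolist())
```

Because every step is a column-wise operation, nothing has to be evaluated per row in Python, which is what makes this approach workable where `axis=1` is unavailable.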
Hope this helps.
As of BigQuery DataFrames (bigframes) version 1.6.0, there is now a preview of support for apply with axis=1, making use of BigQuery Remote Functions.
If your function does something that can't be expressed with the column-wise approach shown in Shobhit's workaround in https://stackoverflow.com/a/78331896/101923, you can now do the following:
Note: There are currently (bigframes==1.6.0) limitations to the data types that are supported. Your row can only contain INT64, FLOAT64, BOOL, or STRING columns (source).
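For the axis=1 route described above, the row function itself might look like the sketch below. It only reads integer and string values, which fits the data type limitation in the note. The sample frame is hypothetical, and the local check uses plain pandas; with BigQuery DataFrames the function would first have to be registered as a remote function (via `bigframes.pandas.remote_function`), and the exact arguments for that are not shown here and should be taken from the bigframes documentation for your version.

```python
import pandas as pd


def condition(row: pd.Series) -> str:
    """Build the identifier for one row from INT64 and STRING values."""
    semester = "S1" if 1 <= row["month"] <= 6 else "S2"
    return f"{row['year']}{semester}{row['CODPY']}{row['CODDE']}"


# Local check with plain pandas (assumption: four-digit years, no NULLs).
# With bigframes, `condition` would need to be wrapped as a remote function
# before being passed to apply(..., axis=1).
sample = pd.DataFrame(
    {
        "year": [2023, 2023, 2024],
        "month": [3, 7, 12],
        "CODPY": ["FR", "DE", "US"],
        "CODDE": ["A1", "B2", "C3"],
    }
)
sample["IDT"] = sample.apply(condition, axis=1)
print(sample["IDT"].tolist())
```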