Write a JSON file from a DataFrame in Pola-rs
Starting with a DataFrame that has a form such as this:

df = pl.DataFrame([{"SkuId":1}])

shape: (1, 1)
┌───────┐
│ SkuId │
│ ---   │
│ i64   │
╞═══════╡
│ 1     │
└───────┘

How can I write it to a JSON…
Dependency DAG Description Pretty straightforward: basically, I am reading some Parquet files from disk using polars, which are the source of data, then doing some moderately heavy-duty processing (a few million rows) to generate an intermediate data frame, then…
I have a table in Postgres and I'm trying to duplicate rows based on the delimiter @ in the description column. Here is my table: txn_id description 3332654 [email protected]@0.9397@$10.64@[email protected]@23.8235@$6.30@KRW@36,[email protected]@$41.84@[email protected]@1.5711@$12.73@[email protected]@0.6013@$8.32@[email protected]@10.6013@$18.32 3332655 [email protected]@0.8197@$11.64@[email protected]@21.8135@$61.30@KRW@36,[email protected]@$411.84@[email protected]@11.5711@$11.73 Using the PostgreSQL code below, the rows are…
I'd like to generate an HTML output of a dataframe with thousands separators in the output. However, pl.Config does not seem to do anything: import polars as pl df = pl.DataFrame({'prod':['apple','banana','melon'], 'price':[7788, 1122, 4400]}) with pl.Config( thousands_separator=" " ): html…
I have a column of long strings (like sentences) on which I want to do the following: replace certain characters; create a list of the remaining strings; if a string is all text, see whether it is in a dictionary…
I have a ~100GB CSV file with the following columns (sex;name;dob;hash). This file was created after some processing of another .csv file. It can contain tuples, which is why there is this hash column. What I need is to delete duplicates from…
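One possible approach to this kind of deduplication, sketched with the standard library only. It assumes that two rows count as duplicates when their `hash` values match, that the file is `;`-delimited with a header row, and that the set of distinct hashes fits in memory. The function name `drop_duplicate_rows` and its parameters are hypothetical. The file is streamed row by row, so the full 100GB is never loaded at once.

```python
import csv


def drop_duplicate_rows(src_path, dst_path, key="hash", delimiter=";"):
    """Copy src_path to dst_path, keeping the first row seen for each key value.

    Returns the number of rows written (excluding the header).
    """
    seen = set()  # key values already written; assumed to fit in memory
    kept = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src, delimiter=delimiter)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, delimiter=delimiter)
        writer.writeheader()
        for row in reader:
            if row[key] in seen:
                continue  # duplicate of an earlier row, skip it
            seen.add(row[key])
            writer.writerow(row)
            kept += 1
    return kept
```

If the distinct hashes do not fit in memory, an external sort on the hash column followed by a single pass that drops adjacent duplicates would be the usual alternative.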
I want to store a DataFrame from a Parquet file in PostgreSQL using Polars, with this code: def store_in_postgresql(df): password = 'anon' username = 'postgres' database = 'nyc_taxis' uri = f'postgresql://{username}:{password}@localhost:5432/{database}' engine = create_engine(uri) common_sql_state = "SQLSTATE: 42P07" try:…
The title says it all. Here is the code snippet. async with EngineContext(uri=URI) as engine: session = async_sessionmaker(bind=engine, expire_on_commit=True)() async with session.begin(): stmt: Select = select(User).order_by(_GenerativeSelect__first=User.login_date.desc()).limit(limit=10) result = await session.execute(statement=stmt) Equivalent to the very simple query, SELECT * FROM User…