I have a SQL table and have made it a dataframe with Python.
Then I calculated new values for a new column.
As you know, it’s easy to add a new column of values to a dataframe, as long as the dataframe and the added column have the same length.
Now I can take the whole dataframe and make a new MySQL table:
df.to_sql(name=key+".1", con=engine, if_exists='fail', index=False)
But it seems a bit too complicated to me to do it this way.
So I tried to insert the values into the new table by using
for index, x in enumerate(df["Time"]):
    try:
        mycursor = mydb.cursor()
        mycursor.execute("UPDATE "+key+" SET DhE="+str(df["DhE"][index])+" WHERE Time='"+str(x)+"'")
        mydb.commit()
    except:
        print("UPDATE "+key+" SET DhE="+str(df["DhE"][index])+" WHERE Time="+str(x)+" ****error")
This is not usable:
- The WHERE clause compares against the Date() datatype, which is the primary key of the table. That was a bad choice: the key is unique, but not usable in practice. Don’t do this.
- Row-by-row WHERE lookups are far too slow for a table of 731,911 rows when adding values for a whole column.
My conclusion and my question are:
Do I have to write a whole new table into my database, or is there a way to just add the column (with values!), which has the same length as the table, to the MySQL table?
for example, using
INSERT INTO XY.new_Column VALUES "whole new column values"
2 Answers
Thank you very much for your answer.
I got a warning that the 'VALUES function' is deprecated and will be removed in a future release, and that I should use an alias (INSERT INTO ... VALUES (...) AS alias) and replace VALUES(col) in the ON DUPLICATE KEY UPDATE clause with alias.col instead.
I changed it to:
In addition, the "Time" column has to be the primary key, so I set
and
After that, MySQL detects the duplicate key and ON DUPLICATE KEY UPDATE can do its magic.
Now I will benchmark this against writing a whole new table :)
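For readers hitting the same warning: the rewrite it asks for can be sketched as below. The table name (`your_table`) is a placeholder from the answer; the column names `Time` and `DhE` are taken from the question's own UPDATE loop.

```python
# Deprecated form: VALUES(col) inside ON DUPLICATE KEY UPDATE.
old_sql = (
    "INSERT INTO your_table (Time, DhE) VALUES (%s, %s) "
    "ON DUPLICATE KEY UPDATE DhE = VALUES(DhE)"
)

# Alias form the warning suggests: name the inserted row (`AS new`)
# and refer to its columns as new.col instead of VALUES(col).
new_sql = (
    "INSERT INTO your_table (Time, DhE) VALUES (%s, %s) AS new "
    "ON DUPLICATE KEY UPDATE DhE = new.DhE"
)
```

The statement text is the only change; the parameter tuples you pass to the cursor stay exactly the same.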
The way you are inserting values into the MySQL table is very inefficient: it iterates through the rows and updates them one by one, which issues a huge number of individual queries to the database and is extremely slow.
Use ALTER TABLE to add the new column, then fill it with INSERT INTO ... ON DUPLICATE KEY UPDATE.
In Python, this should work.
Replace your_table with your table name, new_column_name with the column name, and datatype with the column’s datatype. Also, have values_list as a list of tuples with the time and value for the new column.
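The answer's own code snippet did not survive, but its described approach can be sketched like this. It builds the two statements using the answer's placeholders (`your_table`, `new_column_name`, a datatype, `values_list`); the database calls at the bottom assume a `mydb` connection as in the question and are left commented, and the alias form (`AS new`) is used to avoid the deprecated VALUES() syntax.

```python
table = "your_table"          # your table name
column = "new_column_name"    # the new column
datatype = "DOUBLE"           # the column's datatype (an assumption)

# 1) Add the empty column once.
alter_sql = f"ALTER TABLE {table} ADD COLUMN {column} {datatype}"

# 2) Fill it in one batched statement. Because Time is the primary
#    key, every insert hits a duplicate key on an existing row, and
#    the UPDATE clause writes the new value into that row instead.
insert_sql = (
    f"INSERT INTO {table} (Time, {column}) VALUES (%s, %s) AS new "
    f"ON DUPLICATE KEY UPDATE {column} = new.{column}"
)

# values_list: (time, value) tuples, e.g. built from the dataframe:
# values_list = list(zip(df["Time"], df["DhE"]))
values_list = [("2023-01-01 00:00:00", 1.5), ("2023-01-01 00:01:00", 1.7)]

# mycursor = mydb.cursor()
# mycursor.execute(alter_sql)
# mycursor.executemany(insert_sql, values_list)
# mydb.commit()
```

This sends one ALTER and one batched INSERT instead of one UPDATE per row, which is where the speedup over the question's loop comes from.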