Releases: googleapis/python-bigquery-dataframes
v1.11.0
1.11.0 (2024-07-01)
Features
- Add .agg support for size (#792) (87e6018)
- Add bigframes.bigquery.json_set (#782) (1b613e0)
- Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub (#801) (b47f32d)
- Add DataFrame.to_arrow to create Arrow Table from DataFrame (#807) (1e3feda)
- Add PolynomialFeatures support to to_gbq and pipelines (#805) (57d98b9)
- Add Series.peek to preview data efficiently (#727) (580e1b9)
- Expose gcf memory param in remote_function (#803) (014765c)
- More informative error when query plan too complex (#811) (136dc24)
Bug Fixes
Documentation
v1.10.0
1.10.0 (2024-06-21)
Features
- Add dataframe.insert (#770) (e8bab68)
- Add groupby head API (#791) (44202bc)
- Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
- Bigframes.streaming module for continuous queries (#703) (0433a1c)
- Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)
Bug Fixes
- Allow __repr__ to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
- Df.loc with the 2nd input as bigframes boolean Series (#789) (a4ac82e)
- Ensure numpy version matches in remote_function deployment (#798) (324d93c)
- Fix temp table creation retries by now throwing if table already exists. (#787) (0e57d1f)
- Self-join optimization doesn't needlessly invalidate caching (#797) (1b96b80)
v1.9.0
1.9.0 (2024-06-10)
Features
- Allow functions returned from bpd.read_gbq_function to execute outside of apply (#706) (ad7d8ac)
- Support bigquery.vector_search() (#736) (dad66fd)
- Support score() in GeminiTextGenerator (#740) (b2c7d8b)
- Support bytes type in remote_function (#761) (4915424)
- Support fit() in GeminiTextGenerator (#758) (d751f5c)
Bug Fixes
- ARIMAPlus loads auto_arima_min_order param (#752) (39d7013)
- Improve to_pandas_batches for large results (#746) (61f18cb)
- Resolve issue with unset thread-local options (#741) (d93dbaf)
Documentation
v1.8.0
1.8.0 (2024-05-31)
Features
- merge only generates a default index if both inputs already have an index (#733) (25d049c)
- Add +, - as unary ops, ^ binary op (#724) (968d825)
- Add GroupBy.size() to get number of rows in each group (#479) (1fca588)
- Add DataFrame ~ operator (#721) (354abc1)
- Add GeminiText 1.5 Preview models (#737) (56cbd3b)
- Add slot_millis and add stats to session object (#725) (72e9583)
- Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
- Allow functions decorated with bpd.remote_function() to execute locally (#704) (d850da6)
- Ensure "bigframes-api" label is always set on jobs, even if the API is unknown (#722) (1832778)
- Support ml.SimpleImputer in bigframes (#708) (4c4415f)
- Support type annotations to supply input and output types to bpd.remote_function() decorator (#717) (4a12e3c)
- Support type annotations with bpd.remote_function() and axis=1 (a preview feature) (#730) (e5a2992)
Bug Fixes
- Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
- Fix Null index assign series to column (#711) (ffb4b57)
- Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present (#729) (0e25a3b)
- Warn and disable time travel for linked datasets (#712) (085fa9d)
Performance Improvements
Documentation
v1.7.0
1.7.0 (2024-05-20)
Features
- read_gbq_query supports filters (9386373)
- read_gbq suggests a correct column name when one is not found (9386373)
- Add DefaultIndexKind.NULL to use as index_col in read_gbq*, creating an indexless DataFrame/Series (#662) (29e4886)
- Bigframes.bigquery.array_agg(SeriesGroupBy|DataFrameGroupby) (#663) (412f28b)
- To_datetime supports utc=False for string inputs (#579) (adf9889)
Bug Fixes
- read_gbq_table respects primary keys even when filters are set (#689) (9386373)
- Fix type error in test_cluster (#698) (14d81c1)
- Improve escaping of literals and identifiers (#682) (da9b136)
- Properly identify non-unique index in tables without primary keys (#699) (6e0f4d8)
- Remove a usage of the resource package when not available, such as on Windows (#681) (96243f2)
- The imported samples error and use peek() (#688) (1a0b744)
Performance Improvements
- Don't run query immediately from read_gbq_table if filters is set (9386373)
- Use a LIMIT clause when max_results is set (9386373)
Documentation
v1.6.0
1.6.0 (2024-05-13)
Features
- Add DataFrame.__delitem__ (#673) (2218c21)
- Add Series.case_when() (#673) (2218c21)
- Add strategy="quantile" in KBinsDiscretizer (#654) (c6c487f)
- Add Series.combine (#680) (2fd1b81)
- Series.str.split (#675) (6eb19a7)
- Suggest correct options in bpd.options.bigquery.location (#666) (57ccabc)
- Support axis=1 in df.apply for scalar outputs (#629) (f6bdc4a)
- Support gcf vpc connector in remote_function (#677) (9ca92d0)
- Warn with a more specific DefaultLocationWarning category when no location can be detected (#648) (e084e54)
Bug Fixes
Dependencies
- Add jellyfish as a dependency for spelling correction (57ccabc)
Documentation
v1.5.0
1.5.0 (2024-05-07)
Features
- bigframes.options and bigframes.option_context now use thread-local variables to prevent context managers in separate threads from affecting each other (#652) (651fd7d)
- Add ARIMAPlus.coef_ property exposing ML.ARIMA_COEFFICIENTS functionality (#585) (81d1262)
- Add a unique session_id to Session and allow cleaning up sessions (#553) (c8d4e23)
- Add the bigframes.bigquery sub-package with a bigframes.bigquery.array_length function (#630) (9963f85)
- Always do a query dry run when option.repr_mode == "deferred" (#652) (651fd7d)
- Custom query labels for compute options (#638) (f561799)
- Raise NoDefaultIndexError from read_gbq on clustered/partitioned tables with no index_col or filters set (#631) (73064dd)
- Support index_col=False in read_csv and engine="bigquery" (73064dd)
- Support gcf max instance count in remote_function (#657) (36578ab)
Bug Fixes
- Don't raise UnknownLocationWarning for US or EU multi-regions (#653) (8e4616b)
- Downgrade NoDefaultIndexError to DefaultIndexWarning (#658) (2715d2b)
- Fix bug with na in the column labels in stack (#659) (4a34293)
- Use explicit session in PaLM2TextGenerator (#651) (e4f13c3)
Documentation
v1.4.0
1.4.0 (2024-04-29)
Features
- Add .cache() method to persist intermediate dataframe (#626) (a5c94ec)
- Add transpose support for small homogeneously typed DataFrames. (#621) (054075d)
- Allow single input type in remote_function (#641) (3aa643f)
- Expose gcf max timeout in remote_function (#639) (dfeaad0)
- Series binary ops compatible with more types (#618) (518d315)
- Support the score method for PaLM2TextGenerator (#634) (3ffc1d2)
Bug Fixes
- Allow to_pandas to download more than 10GB (#637) (ce56495)
- Extend row hash to 128 bits to guarantee unique row id (#632) (9005c6e)
- Llm fine tuning tests (#627) (4724a1a)
- Llm palm score tests (#643) (cf4ec3a)
Performance Improvements
- Automatically condense internal expression representation (#516) (03c1b0d)
- Cache transpose to allow performant retranspose (#635) (44b738d)
Documentation
v1.3.0
1.3.0 (2024-04-22)
Features
- Add Series.struct.dtypes property (#599) (d924ec2)
- Add fine tuning fit() for Palm2TextGenerator (#616) (9c106bd)
- Add quantile statistic (#613) (bc82804)
- Expose max_batching_rows in remote_function (#622) (240a1ac)
- Support primary key(s) in read_gbq by using as the index_col by default (#625) (75bb240)
- Warn if location is set to unknown location (#609) (3706b4f)
Bug Fixes
- Address technical writers feedback (#611) (9f8f181)
- Infer narrowest numeric type when combining numeric columns (#602) (8f9ece6)
- Use exact median implementation by default (#619) (9d205ae)
Documentation
v1.2.0
1.2.0 (2024-04-15)
Features
- Add hasnans, combine_first, update to Series (#600) (86e0f38)
- Add MultiIndex subclass. (#596) (5d0f149)
- Add pivot_table for DataFrame. (#473) (5f1d670)
- Add Series.autocorr (#605) (4ec8034)
- Support list of numerics in pandas.cut (#580) (290f95d)
Bug Fixes
- Address more technical writers feedback (#581) (4b08d92)
- Error for object dtype on read_pandas (#570) (8702dcf)
- Inverting int now does bitwise inversion rather than sign flip (#574) (5f1db8b)
- Loc setitem dtype issue. (#603) (b94bae9)
- Toc menu missing plotting name (#591) (eed12c1)
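
As an illustration of the 1.2.0 fix "Inverting int now does bitwise inversion rather than sign flip" (#574), the sketch below contrasts the two operations in plain Python. It is not BigQuery DataFrames code. It assumes the fixed behavior follows the usual two's-complement rule that Python's own ~ operator uses on integers (~x == -x - 1). The helper names bitwise_invert and sign_flip are hypothetical and exist only for this comparison.

```python
def bitwise_invert(x: int) -> int:
    """Bitwise NOT on an integer, as Python's ~ operator defines it."""
    return ~x


def sign_flip(x: int) -> int:
    """Arithmetic negation, shown only to contrast with bitwise NOT."""
    return -x


values = [0, 1, 5, -3]
inverted = [bitwise_invert(v) for v in values]
negated = [sign_flip(v) for v in values]

# Bitwise inversion satisfies ~x == -x - 1, so it always differs from -x by one.
for v, inv, neg in zip(values, inverted, negated):
    print(f"x={v:>2}  ~x={inv:>2}  -x={neg:>2}")
```

The two results never coincide for integers, which is why code that relied on the old sign-flip result would see different values after upgrading.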