TensorFlow Data Validation 0.25.0
Version 0.25.0
Major Features and Improvements
- Add support for detecting drift and distribution skew in numeric features.
- tfdv.validate_statistics now also reports the raw measurements of
  distribution skew/drift (if any such check is run), regardless of whether
  skew/drift is detected. The report is in the drift_skew_info field of the
  Anomalies proto (the return value of validate_statistics).
- Starting with this release, TFDV also hosts nightly packages on
  https://pypi-nightly.tensorflow.org. To install the nightly package, use the
  following command:
    pip install -i https://pypi-nightly.tensorflow.org/simple tensorflow-data-validation
  Note: These nightly packages are unstable and breakages are likely to
  happen. A fix can often take a week or more, depending on its complexity,
  before updated wheels are available on the PyPI cloud service. You can
  always use the stable version of TFDV available on PyPI by running:
    pip install tensorflow-data-validation
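As a rough illustration of reading the new skew/drift report, the sketch below flattens the measurements into simple rows. It assumes the Anomalies proto exposes a repeated drift_skew_info field whose entries carry path.step, drift_measurements and skew_measurements, each measurement having value and threshold; these field names are recalled from the tensorflow-metadata proto definition and should be checked against the installed version. The helper name summarize_drift_skew and the stand-in object are hypothetical.

```python
from types import SimpleNamespace


def summarize_drift_skew(anomalies):
    """Flatten drift/skew measurements into (feature, kind, value, threshold) rows.

    Assumption: `anomalies` follows the Anomalies proto layout, i.e. a repeated
    `drift_skew_info` field with `path.step`, `drift_measurements` and
    `skew_measurements`, where each measurement has `value` and `threshold`.
    """
    rows = []
    for info in anomalies.drift_skew_info:
        feature = ".".join(info.path.step)
        for kind, measurements in (("drift", info.drift_measurements),
                                   ("skew", info.skew_measurements)):
            for m in measurements:
                rows.append((feature, kind, m.value, m.threshold))
    return rows


# Stand-in object for illustration only. In real use, `anomalies` would be the
# return value of tfdv.validate_statistics(...).
fake_anomalies = SimpleNamespace(drift_skew_info=[
    SimpleNamespace(
        path=SimpleNamespace(step=["age"]),
        drift_measurements=[SimpleNamespace(value=0.02, threshold=0.1)],
        skew_measurements=[],
    )
])
rows = summarize_drift_skew(fake_anomalies)
```

Because measurements are reported whether or not an anomaly fires, rows like these can be logged over time to see how close a feature is to its threshold.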
Bug Fixes and Other Changes
- Added tfdv.load_stats_binary to load stats that were written using
  tfdv.WriteStatisticsToText (now tfdv.WriteStatisticsToBinaryFile).
- Anomalies previously (un)classified as UNKNOWN_TYPE now trigger more specific
  anomaly types: DOMAIN_INVALID_FOR_TYPE, UNEXPECTED_DATA_TYPE,
  FEATURE_MISSING_NAME, FEATURE_MISSING_TYPE, INVALID_SCHEMA_SPECIFICATION.
- Fixed a bug where import tensorflow_data_validation would fail if IPython
  was not installed. IPython is an optional dependency of TFDV.
- Depends on apache-beam[gcp]>=2.25,<3.
- Depends on tensorflow-metadata>=0.25,<0.26.
- Depends on tensorflow-transform>=0.25,<0.26.
- Depends on tfx-bsl>=0.25,<0.26.
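To compare an environment against the version pins listed above, a small standard-library sketch such as the following can report what is installed. The helper name installed_versions is hypothetical, and the script only prints versions; it does not evaluate the version ranges themselves.

```python
from importlib import metadata

# Distribution names taken from the dependency list above.
PINNED = ["apache-beam", "tensorflow-metadata", "tensorflow-transform", "tfx-bsl"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(PINNED)
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```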
Known Issues
- N/A
Breaking Changes
- tfdv.WriteStatisticsToText is renamed to tfdv.WriteStatisticsToBinaryFile.
  The former is still available but will be removed in a future release.
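Code that must run against both older and newer TFDV releases could select whichever name is available. The sketch below is one possible approach, not an official migration path; the helper pick_stats_writer is hypothetical, and it takes the module as an argument so it can be exercised without TFDV installed.

```python
from types import SimpleNamespace


def pick_stats_writer(module):
    """Return the statistics writer, preferring the new name over the old one."""
    writer = getattr(module, "WriteStatisticsToBinaryFile", None)
    if writer is None:
        # Fall back to the pre-0.25 name; raises AttributeError if neither exists.
        writer = getattr(module, "WriteStatisticsToText")
    return writer


# Stand-in modules for illustration; in real use pass the imported
# tensorflow_data_validation module instead.
old_style = SimpleNamespace(WriteStatisticsToText="old-writer")
new_style = SimpleNamespace(WriteStatisticsToText="old-writer",
                            WriteStatisticsToBinaryFile="new-writer")
chosen_old = pick_stats_writer(old_style)
chosen_new = pick_stats_writer(new_style)
```

Preferring the new name first means the fallback stops being exercised once the old alias is removed.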
Deprecations
- N/A