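A First Look at TFX
Background
TFX stands for TensorFlow Extended. It is an end-to-end platform for deploying production machine learning pipelines.
Environment Setup
The environment matches the official tutorial: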
- python: 3.6
- tensorflow: 2.4.1
- tfx: 0.29.0
Install TensorFlow
Install Python 3.6
conda init bash
conda config --add channels https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
conda create -n tfx python=3.6
conda activate tfx
Upgrade pip
pip install --upgrade pip
Install TensorFlow and TFX
python -m pip install tfx==0.29.0 tensorflow==2.4.1 -i https://mirrors.aliyun.com/pypi/simple
Result
Successfully installed MarkupSafe-1.1.1 Send2Trash-1.5.0 apache-beam-2.28.0 appnope-0.1.2 argon2-cffi-20.1.0 async-generator-1.10 attrs-20.3.0 avro-python3-1.9.1 backcall-0.2.0 bleach-3.3.0 cffi-1.14.5 click-7.1.2 colorama-0.4.4 crcmod-1.7 dataclasses-0.8 decorator-5.0.7 defusedxml-0.7.1 dill-0.3.1.1 docker-4.4.4 docopt-0.6.2 entrypoints-0.3 fastavro-1.4.0 fasteners-0.16 future-0.18.2 google-api-core-1.26.3 google-api-python-client-1.12.8 google-apitools-0.5.31 google-auth-httplib2-0.1.0 google-cloud-bigquery-1.28.0 google-cloud-bigtable-1.7.0 google-cloud-build-2.0.0 google-cloud-core-1.6.0 google-cloud-datastore-1.15.3 google-cloud-dlp-1.0.0 google-cloud-language-1.3.0 google-cloud-pubsub-1.7.0 google-cloud-spanner-1.19.1 google-cloud-storage-1.37.1 google-cloud-videointelligence-1.16.1 google-cloud-vision-1.0.0 google-crc32c-1.1.2 google-resumable-media-1.2.0 googleapis-common-protos-1.53.0 grpc-google-iam-v1-0.12.3 grpcio-gcp-0.2.2 hdfs-2.6.0 httplib2-0.17.4 ipykernel-5.5.3 ipython-7.16.1 ipython-genutils-0.2.0 ipywidgets-7.6.3 jedi-0.18.0 jinja2-2.11.3 joblib-0.14.1 jsonschema-3.2.0 jupyter-client-6.2.0 jupyter-core-4.7.1 jupyterlab-pygments-0.1.2 jupyterlab-widgets-1.0.0 keras-tuner-1.0.1 kubernetes-11.0.0 libcst-0.3.18 mistune-0.8.4 ml-metadata-0.29.0 ml-pipelines-sdk-0.29.0 mock-2.0.0 mypy-extensions-0.4.3 nbclient-0.5.3 nbconvert-6.0.7 nbformat-5.1.3 nest-asyncio-1.5.1 notebook-6.3.0 oauth2client-4.1.3 packaging-20.9 pandas-1.1.5 pandocfilters-1.4.3 parso-0.8.2 pbr-5.5.1 pexpect-4.8.0 pickleshare-0.7.5 prometheus-client-0.10.1 promise-2.3 prompt-toolkit-3.0.18 proto-plus-1.18.1 ptyprocess-0.7.0 pyarrow-2.0.0 pycparser-2.20 pydot-1.4.2 pygments-2.8.1 pymongo-3.11.3 pyparsing-2.4.7 pyrsistent-0.17.3 python-dateutil-2.8.1 pytz-2021.1 pyyaml-5.4.1 pyzmq-22.0.3 scikit-learn-0.24.1 scipy-1.5.4 tabulate-0.8.9 tensorflow-cloud-0.1.13 tensorflow-data-validation-0.29.0 tensorflow-datasets-3.0.0 tensorflow-hub-0.9.0 tensorflow-metadata-0.29.0 
tensorflow-model-analysis-0.29.0 tensorflow-serving-api-2.4.1 tensorflow-transform-0.29.0 terminado-0.9.4 terminaltables-3.1.0 testpath-0.4.4 tfx-0.29.0 tfx-bsl-0.29.0 threadpoolctl-2.1.0 tornado-6.1 tqdm-4.60.0 traitlets-4.3.3 typing-inspect-0.6.0 uritemplate-3.0.1 wcwidth-0.2.5 webencodings-0.5.1 websocket-client-0.58.0 widgetsnbextension-3.5.1
Check the versions
>>> import tensorflow as tf
>>> import tfx
>>> print('TensorFlow version: {}'.format(tf.__version__))
TensorFlow version: 2.4.1
>>> print('TFX version: {}'.format(tfx.__version__))
TFX version: 0.29.0
Install TFMA (TensorFlow Model Analysis)
pip install tensorflow_model_analysis tfx==0.29.0 tensorflow==2.4.1 -i https://mirrors.aliyun.com/pypi/simple
Check the versions
>>> import tensorflow_model_analysis as tfma
>>> print('TFMA version: {}'.format(tfma.__version__))
TFMA version: 0.29.0
>>> import apache_beam as beam
>>> print('Beam version: {}'.format(beam.__version__))
Beam version: 2.28.0
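The manual checks above can be folded into one small script. This is a minimal sketch, not part of the official tutorial: the helper names `installed_version` and `find_mismatches` are made up for illustration, and the pins are simply the versions printed in this post. It relies only on each package exposing `__version__`, as the REPL sessions above show.

```python
import importlib

# Pins copied from the versions reported in this post.
EXPECTED = {
    "tensorflow": "2.4.1",
    "tfx": "0.29.0",
    "tensorflow_model_analysis": "0.29.0",
    "apache_beam": "2.28.0",
}


def installed_version(module_name):
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(expected, lookup=installed_version):
    """Return {name: (expected, actual)} for every pin that does not match."""
    mismatches = {}
    for name, wanted in expected.items():
        actual = lookup(name)
        if actual != wanted:
            mismatches[name] = (wanted, actual)
    return mismatches
```

Calling `find_mismatches(EXPECTED)` inside the `tfx` conda environment should return an empty dict when the installation matches the pins. The `lookup` parameter exists only so the comparison can be exercised without importing the real packages.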
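Scenario
This walkthrough uses the taxi trip dataset published by the City of Chicago. Its columns include pickup time, pickup location, dropoff time, dropoff location, and payment type, and the goal is to predict tipping.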
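To make "predict tipping" concrete, here is one way a binary label could be derived from raw rows. This is only a sketch under stated assumptions: the column names `fare` and `tips`, the 20% threshold, the sample rows, and the helper names `big_tipper` and `label_rows` are all illustrative and do not come from this post.

```python
import csv
import io

# Hypothetical sample rows; column names and values are assumptions.
SAMPLE_CSV = """payment_type,fare,tips
Credit Card,10.0,3.0
Cash,8.5,0.0
Credit Card,20.0,2.0
"""


def big_tipper(row, threshold=0.2):
    """Return 1 if the tip exceeds threshold * fare, else 0."""
    fare = float(row["fare"])
    tips = float(row["tips"])
    return int(tips > fare * threshold)


def label_rows(csv_text):
    """Parse CSV text and return one 0/1 label per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [big_tipper(row) for row in reader]


labels = label_rows(SAMPLE_CSV)
```

Framing the target as a 0/1 label turns the task into binary classification. A regression on the tip amount would be an equally valid reading of the problem statement.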
Using Jupyter
Launch Jupyter with a specific Python interpreter
python /Users/xuqian/Library/Python/3.7/bin/jupyter-notebook
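The path above points into a Python 3.7 directory, while the conda environment was created with Python 3.6, so it may be worth confirming which interpreter the notebook kernel actually runs on. A minimal sketch to paste into the first notebook cell (the helper name `interpreter_info` is made up for illustration):

```python
import sys


def interpreter_info():
    """Report the running interpreter's path and major.minor version."""
    return {
        "executable": sys.executable,
        "version": "{}.{}".format(sys.version_info[0], sys.version_info[1]),
    }


info = interpreter_info()
print("Interpreter: {executable} (Python {version})".format(**info))
```

If the reported executable is not the one inside the `tfx` environment, the TensorFlow and TFX imports in the notebook may resolve to different versions than the ones installed above.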