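This page is an excerpt and archive of the corresponding MITBBS (未名空间) post. Posts less than a week old show at most 50 characters; posts older than a week show up to 500.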
DataSciences board - What's the best way to convert a text/CSV file into Parquet?
s****h
Posts: 3979
1
I have text/CSV files that I want to upload to a Cloudera cluster and use
in Spark.
What's the best way to upload them and convert them to Parquet format?
To load, use either the file manager in Hue or SFTP?
To convert, I can think of 3 ways:
A.
In Hive, create an external table based on the original file,
then create a new external table in Parquet format?
B.
In Spark, use Scala code to convert? Conversion speed might be a concern.
https://developer.ibm.com/hadoop/blog/2015/12/03/parquet-for-sp
C.
Use Apache Drill? Has anyone installed Apache Drill on CDH before?
Conversion speed would probably be better.
https://www.mapr.com/blog/how-convert-csv-file-apache-parquet-using-apache-drill
Apache Drill needs to be installed first:
https://drill.apache.org/docs/installing-drill-on-the-cluster/
With Sqoop it's much easier, since it has the "--as-parquetfile" option.
Thanks!
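For reference, option A can be sketched roughly as below. This is only an illustration under assumptions, not something from the thread: the table names, column list, and HDFS directories are hypothetical, and the helper just builds the HiveQL text so it can be pasted into Hue or beeline. It assumes the CSV is already on HDFS, is comma-delimited, and has no quoted fields or header row.

```python
def hive_csv_to_parquet_sql(csv_table, parquet_table, columns, csv_dir, parquet_dir):
    """Build the HiveQL statements for option A.

    columns is a list of (name, hive_type) pairs. All names and paths
    passed in are placeholders chosen by the caller (hypothetical here).
    """
    cols = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)

    # 1. External table over the raw CSV files already sitting in csv_dir.
    #    Plain delimited text does not understand quoted fields that
    #    contain commas; such files need a CSV SerDe instead.
    create_csv = (
        f"CREATE EXTERNAL TABLE {csv_table} (\n  {cols}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{csv_dir}'"
    )

    # 2. Empty external table with the same columns, stored as Parquet.
    create_parquet = (
        f"CREATE EXTERNAL TABLE {parquet_table} (\n  {cols}\n)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{parquet_dir}'"
    )

    # 3. Copy the rows; Hive writes them out as Parquet files in parquet_dir.
    copy_rows = f"INSERT OVERWRITE TABLE {parquet_table} SELECT * FROM {csv_table}"

    return [create_csv, create_parquet, copy_rows]


if __name__ == "__main__":
    # Hypothetical example values.
    statements = hive_csv_to_parquet_sql(
        "sales_csv",
        "sales_parquet",
        [("id", "INT"), ("name", "STRING"), ("amount", "DOUBLE")],
        "/user/demo/sales_csv",
        "/user/demo/sales_parquet",
    )
    for stmt in statements:
        print(stmt + ";\n")
```

Spark can then read the Parquet directory directly, so the text table is only needed as a staging step.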
c*******n
Posts: 679