DataSciences board - questions about SVD and ALSWR for collaborative filtering
s****h
Posts: 3979
1
Two questions:
1.
For a recommendation engine based on collaborative filtering, the results of
ALSWR in Mahout should be very similar to the results of SVD in Spark's MLlib,
right?
Since SVD with Spark + MLlib performs very well, can we forget about
ALSWR in Mahout?
2.
How do you evaluate SVD?
My understanding: for a known user/item matrix M, we remove some of the
known user/item pairs to get a new matrix M1, then run SVD on M1 to get
the reconstructed matrix M2. Comparing the removed user/item pairs between M
and M2 lets us evaluate the SVD.
This would make SVD evaluation very slow, since you might want to generate many
M1 matrices and run SVD on each one to get M2, and SVD is computationally
intensive.
How do people deal with evaluating ALSWR in Mahout? The evaluation
process would take even longer, right?
Thanks.
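The hold-out procedure described in question 2 can be sketched roughly as follows. This is only an illustration in NumPy, not the MLlib or Mahout API: the function name `holdout_rmse`, the use of NaN to mark unknown ratings, and filling unknown cells with the mean rating before the SVD are all assumptions made for the example.

```python
import numpy as np

def holdout_rmse(M, rank, holdout_frac=0.2, seed=0):
    """Hide some known entries of M, fit a truncated SVD on the rest (M1),
    rebuild M2, and return the RMSE on the hidden entries."""
    rng = np.random.default_rng(seed)
    known = np.argwhere(~np.isnan(M))            # (row, col) of known ratings
    n_hold = max(1, int(len(known) * holdout_frac))
    picked = rng.choice(len(known), size=n_hold, replace=False)
    held = known[picked]
    rows, cols = held[:, 0], held[:, 1]

    M1 = M.astype(float)                         # astype returns a copy
    M1[rows, cols] = np.nan                      # remove the held-out pairs

    # Assumption: plain SVD needs a dense matrix, so unknown cells are
    # filled with the mean of the remaining known ratings.
    filled = np.where(np.isnan(M1), np.nanmean(M1), M1)

    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    M2 = (U[:, :rank] * s[:rank]) @ Vt[:rank]    # rank-k reconstruction

    err = M[rows, cols] - M2[rows, cols]         # compare M and M2 on held-out pairs
    return float(np.sqrt(np.mean(err ** 2)))
```

Calling it with several different `seed` values gives the repeated M1/M2 runs the post mentions, which is where the cost adds up: each call performs one full SVD.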
l******n
Posts: 9344
2
It depends on your data. I don't think you can draw a general conclusion.

s****h
Posts: 3979
3
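SVD does not handle binary data or sparse data well, which is why ALS is used.
What I really want to know is this:
Setting aside computational complexity, running time, parallelism and so on, and
comparing on accuracy alone, is SVD better than ALSWR? After all, SVD gives you
singular values, and it is more convincing from the perspective of a
transformation between spaces.
And if some algorithm A is better than SVD, is it then necessarily better than ALSWR?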
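To make the sparse-data point concrete, here is a minimal NumPy sketch of ALS with weighted-lambda regularization (ALS-WR). It is not Mahout's implementation; the function name `als_wr`, the NaN convention for unknown ratings, and the default parameters are assumptions for illustration. Unlike a plain SVD, each update uses only the observed ratings, so no fill-in value is needed.

```python
import numpy as np

def als_wr(R, rank=2, lam=0.05, iters=20, seed=0):
    """Factor R ~ U @ V.T using only observed (non-NaN) entries.
    The ridge penalty for each user/item is scaled by its rating count."""
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(R)
    n_users, n_items = R.shape
    U = rng.random((n_users, rank))
    V = rng.random((n_items, rank))
    eye = np.eye(rank)

    for _ in range(iters):
        # Fix V, solve a small regularized least-squares problem per user.
        for i in range(n_users):
            obs = mask[i]
            n = int(obs.sum())
            if n == 0:
                continue                      # user with no ratings: leave as is
            Vi = V[obs]
            U[i] = np.linalg.solve(Vi.T @ Vi + lam * n * eye, Vi.T @ R[i, obs])
        # Fix U, solve the same kind of problem per item.
        for j in range(n_items):
            obs = mask[:, j]
            n = int(obs.sum())
            if n == 0:
                continue                      # item with no ratings: leave as is
            Uj = U[obs]
            V[j] = np.linalg.solve(Uj.T @ Uj + lam * n * eye, Uj.T @ R[obs, j])
    return U, V
```

The predicted matrix is `U @ V.T`, which can be scored on held-out pairs in the same way as an SVD reconstruction, so the two methods can be compared on accuracy alone for a given data set.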