Load svmlight format error
Problem description
When I try to use the svmlight Python package with data that I have already converted to svmlight format, I get an error. This should be pretty basic, and I don't understand what's happening. Here's the code:
import svmlight
training_data = open('thedata', "w")
model=svmlight.learn(training_data, type='classification', verbosity=0)
I also tried:
training_data = numpy.load('thedata')
and
training_data = __import__('thedata')
Recommended answer
One obvious problem is that you truncate your data file when you open it, because you specify write mode "w". This means there will be no data left to read.
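The truncation is easy to confirm on a throwaway file. The sketch below uses only the standard library; the temporary file and its single line of sample data are made up for illustration and are not the asker's data.

```python
import os
import tempfile

# Create a throwaway file holding one made-up line of SVM-format data.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("1 1:0.5 2:0.25\n")

# Read mode leaves the contents untouched.
with open(path, "r") as f:
    contents_read = f.read()
size_before = os.path.getsize(path)

# Write mode truncates the file as soon as it is opened, before any write.
with open(path, "w"):
    pass
size_after = os.path.getsize(path)

print(size_before > 0, size_after)  # prints: True 0
os.remove(path)
```

So opening 'thedata' with "w" even once is enough to empty it, and the file would have to be regenerated before trying again.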
In any case, you don't need to read the file like that. If your data file is like the one in this example, it is a Python file, so you need to import it. This should work:
import svmlight
from data import train0 as training_data    # assuming your data file is named data.py
# or you could use __import__()
#training_data = __import__('data').train0
model = svmlight.learn(training_data, type='classification', verbosity=0)
You might want to compare your data against that of the example.
Edit after the data file format was clarified
The input file needs to be parsed into a list of tuples like this:
[(target, [(feature_1, value_1), (feature_2, value_2), ... (feature_n, value_n)]),
 (target, [(feature_1, value_1), (feature_2, value_2), ... (feature_n, value_n)]),
 ...
]
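As a concrete illustration of that shape, here is a small made-up data set with two examples, together with a hypothetical helper (`check_structure` is not part of the svmlight package) that verifies the nesting before the list is handed to a learner. The feature numbers and values are invented.

```python
# Hypothetical toy data: two examples with targets +1 and -1.
training_data = [
    (1, [(1, 0.5), (3, 1.0)]),
    (-1, [(2, 0.25), (3, 0.75)]),
]

def check_structure(data):
    """Return True if data is a list of (target, [(feature, value), ...]) tuples."""
    if not isinstance(data, list):
        return False
    for example in data:
        if not (isinstance(example, tuple) and len(example) == 2):
            return False
        target, features = example
        if not isinstance(target, (int, float)) or not isinstance(features, list):
            return False
        for pair in features:
            if not (isinstance(pair, tuple) and len(pair) == 2):
                return False
            feature, value = pair
            if not isinstance(feature, int) or not isinstance(value, (int, float)):
                return False
    return True

print(check_structure(training_data))  # prints: True
```

A file object or a raw string fails this check, which is the situation in the question's original code.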
The svmlight package does not appear to support reading from a file in the SVM file format, and it has no parsing functions, so the parsing has to be implemented in Python. SVM files look like this:
<target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>
Here is a parser that converts from the file format to the structure required by the svmlight package:
def svm_parse(filename):
    """Yield (target, [(feature, value), ...]) tuples from an SVM-format file."""
    def _convert(t):
        """Convert feature and value to appropriate types"""
        return (int(t[0]), float(t[1]))
    with open(filename) as f:
        for line in f:
            line = line.split('#')[0].strip()  # remove comments (whole-line or trailing)
            if not line:
                continue  # skip blank and comment-only lines
            data = line.split()
            target = float(data[0])
            features = [_convert(feature.split(':')) for feature in data[1:]]
            yield (target, features)
You can use it like this:
import svmlight
training_data = list(svm_parse('thedata'))
model=svmlight.learn(training_data, type='classification', verbosity=0)
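To check the parsing rules without installing svmlight or touching the real data file, the same per-line logic can be applied to in-memory text. This is only a sketch: `parse_svm_lines` is a hypothetical variant of the parser that accepts any iterable of lines, and the sample text is invented.

```python
import io

def parse_svm_lines(lines):
    """Yield (target, [(feature, value), ...]) from an iterable of SVM-format lines."""
    for line in lines:
        line = line.split('#', 1)[0].strip()  # drop whole-line and trailing comments
        if not line:
            continue  # skip blank and comment-only lines
        fields = line.split()
        target = float(fields[0])
        features = []
        for item in fields[1:]:
            index, value = item.split(':')
            features.append((int(index), float(value)))
        yield (target, features)

# Made-up sample: a comment line, a line with a trailing comment, a blank line.
sample = io.StringIO(
    "# made-up sample data\n"
    "+1 1:0.5 3:1.0 # first example\n"
    "\n"
    "-1 2:0.25 3:0.75\n"
)
parsed = list(parse_svm_lines(sample))
print(parsed)
# prints: [(1.0, [(1, 0.5), (3, 1.0)]), (-1.0, [(2, 0.25), (3, 0.75)])]
```

Because the helper takes any iterable of lines, an open file object can be passed in place of the `io.StringIO` sample.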