Detail

Config
class caver.config.Config
    Basic config. All model configs should inherit from this class.

    - batch_size = 256: batch size
    - checkpoint_dir = 'checkpoints': checkpoint directory
    - dropout = 0.15: dropout rate
    - embedding_dim = 256: embedding dimension
    - epoch = 10: number of training epochs
    - input_data_dir = 'dataset': input data directory
    - lr = 0.0001: learning rate
    - master_device = 0: GPU device number
    - multi_gpu = False: whether to use multiple GPUs
    - output_data_dir = 'processed_data': directory for processed data
    - recall_k = 5: k for recall@k
    - train_filename = 'nlpcc_train.tsv': training file name
    - valid_filename = 'nlpcc_valid.tsv': validation file name
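Since all model configs inherit from Config, a new config only needs to override the defaults it changes. The sketch below illustrates the pattern with a stand-in definition of Config (copied from the attribute list above) so it runs without caver installed; the real class lives in caver.config, and MyModelConfig is a hypothetical example name.

```python
# Stand-in for caver.config.Config: a plain class whose class attributes
# hold the default hyperparameters listed above.
class Config:
    batch_size = 256
    checkpoint_dir = 'checkpoints'
    dropout = 0.15
    embedding_dim = 256
    epoch = 10
    input_data_dir = 'dataset'
    lr = 0.0001
    master_device = 0
    multi_gpu = False
    output_data_dir = 'processed_data'
    recall_k = 5
    train_filename = 'nlpcc_train.tsv'
    valid_filename = 'nlpcc_valid.tsv'

# A model config (hypothetical) overrides only what differs from the defaults;
# everything else is inherited unchanged.
class MyModelConfig(Config):
    batch_size = 128
    dropout = 0.3

cfg = MyModelConfig()
print(cfg.batch_size, cfg.lr)  # 128 0.0001
```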
class caver.config.ConfigCNN
    CNN model config.

    - filter_num = 6: number of filters
    - filter_sizes = [2, 3, 4]: list of filter sizes
    - model = 'CNN': model name
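In the usual TextCNN layout these two fields determine the pooled feature width: one convolution per window size in filter_sizes, each producing filter_num feature maps, concatenated after max-pooling over time. This is a hedged sketch of that arithmetic; caver's exact layer construction may differ.

```python
# ConfigCNN defaults from the listing above.
filter_num = 6            # feature maps produced per filter size
filter_sizes = [2, 3, 4]  # convolution window widths, in tokens

# One conv per window width; after max-pooling over time, the concatenated
# feature vector fed to the classifier has this many dimensions.
total_features = filter_num * len(filter_sizes)
print(total_features)  # 18
```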
Data

Utils
class caver.utils.MiniBatchWrapper(dl, x_var, y_vars)
    Wrap a plain torchtext iterator so that each batch yields the input together with multiple y labels.
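A minimal illustrative stand-in for what such a wrapper does: iterate the underlying batch iterator, pull the x field named by x_var off each batch, and gather the label fields named in y_vars. This is a sketch, not caver's implementation (which operates on torchtext batches and tensors); the SimpleNamespace batches below are purely for demonstration.

```python
from types import SimpleNamespace

class MiniBatchWrapper:
    """Yield (x, [y1, y2, ...]) pairs from an iterator of attribute-style batches."""

    def __init__(self, dl, x_var, y_vars):
        self.dl = dl          # underlying batch iterator
        self.x_var = x_var    # name of the input field on each batch
        self.y_vars = y_vars  # names of the label fields on each batch

    def __iter__(self):
        for batch in self.dl:
            x = getattr(batch, self.x_var)
            ys = [getattr(batch, v) for v in self.y_vars]
            yield x, ys

    def __len__(self):
        return len(self.dl)

# Demonstration with fake attribute-style batches.
batches = [SimpleNamespace(text=[1, 2], label_a=0, label_b=1)]
wrapped = MiniBatchWrapper(batches, 'text', ['label_a', 'label_b'])
for x, ys in wrapped:
    print(x, ys)  # [1, 2] [0, 1]
```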
caver.utils.load_embedding(embedding_file, dim, vocab_size, index2word)
    Load a pre-trained embedding file.

    The first line of the file gives the number of words and the vector
    dimension. Each subsequent line is a word followed by its vector
    components, all separated by spaces:

        1024 64    # 1024 words, 64-dimensional vectors
        a 0.223 0.566 ...
        b 0.754 0.231 ...
        ...
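The file format described above can be parsed in a few lines. This is a hedged sketch of reading that format only; the real load_embedding additionally builds an embedding matrix of shape (vocab_size, dim) ordered by index2word, which is omitted here. parse_embedding is a hypothetical helper name.

```python
import io

def parse_embedding(fileobj):
    """Read the header line, then one 'word v1 ... vdim' line per word."""
    header = fileobj.readline().split()
    num_words, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in fileobj:
        parts = line.split()
        word, vec = parts[0], [float(x) for x in parts[1:]]
        assert len(vec) == dim  # every row must match the declared dimension
        vectors[word] = vec
    return num_words, dim, vectors

# Tiny in-memory example in the documented format: 2 words, 3-d vectors.
sample = io.StringIO("2 3\na 0.1 0.2 0.3\nb 0.4 0.5 0.6\n")
num_words, dim, vectors = parse_embedding(sample)
print(num_words, dim, vectors['a'])  # 2 3 [0.1, 0.2, 0.3]
```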