Commit 91a1258

Merge pull request #96 from NTMC-Community/dev
Version 1.1
2 parents 8883d09 + 709b6c2 commit 91a1258

71 files changed: +6288 -1418 lines changed

.flake8

+5
@@ -1,4 +1,9 @@
 [flake8]
+
+# Maximum number of characters on a single line. Ideally, lines should be under 79 characters,
+# but we allow some leeway before calling it an error.
+max-line-length = 90
+
 ignore =
 # D401 First line should be in imperative mood
 D401,
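
Review note: to illustrate what the new `max-line-length = 90` setting means in practice, here is a minimal sketch that reads the value from a trimmed copy of the config and lists lines exceeding it. The names `FLAKE8_TEXT`, `load_max_line_length`, and `overlong_lines` are hypothetical and introduced only for this example; this is not how flake8 itself is implemented, only an approximation of its line-length check.

```python
import configparser
from typing import List, Tuple

# Trimmed copy of the .flake8 file from this commit (assumption: the
# continuation line under `ignore =` is indented in the real file).
FLAKE8_TEXT = """\
[flake8]
max-line-length = 90
ignore =
    D401,
"""


def load_max_line_length(text: str) -> int:
    """Read `max-line-length` from the `[flake8]` section of an INI string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getint("flake8", "max-line-length")


def overlong_lines(source: str, limit: int) -> List[Tuple[int, int]]:
    """Return (line number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


if __name__ == "__main__":
    limit = load_max_line_length(FLAKE8_TEXT)
    sample = "x = 1\n" + "y = '" + "a" * 100 + "'\n"
    for number, length in overlong_lines(sample, limit):
        print(f"line {number}: {length} characters (limit {limit})")
```

A line of exactly 90 characters passes under this setting; only longer lines are reported.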

.gitattributes

+1
@@ -0,0 +1 @@
+tutorials/* linguist-vendored

.gitignore

+2-1
@@ -26,4 +26,5 @@ notebooks/wikiqa/.ipynb_checkpoints/*
 .cache
 .tmpdir
 htmlcov/
-docs/_build
+docs/_build
+matchzoo_py.egg-info/

CODEOWNERS

+1-1
@@ -7,7 +7,7 @@
 # to get notified about changes in a specific package.
 # See https://help.github.com/articles/about-teams how to setup teams.

-# Define individuals or teams that are responsible for code in a repository.
+# Define individuals or teams that are responsible for code in a repository.

 # global owner.
 * @faneshion

CONTRIBUTING.md

+17-17
@@ -1,23 +1,23 @@
-Contributing to MatchZoo
+Contributing to MatchZoo-py
 ----------

-> Note: MatchZoo is developed under Python 3.6.
+> Note: MatchZoo-py is developed under Python 3.6.

-Welcome! MatchZoo is a community project that aims to work for a wide range of NLP and IR tasks such as Question Answering, Information Retrieval, Paraphrase identification etc. Your experience and what you can contribute are important to the project's success.
+Welcome! MatchZoo-py is a community project that aims to work for a wide range of NLP and IR tasks such as Question Answering, Information Retrieval, Paraphrase identification etc. Your experience and what you can contribute are important to the project's success.

 Discussion
 ----------

-If you've run into behavior in MatchZoo you don't understand, or you're having trouble working out a good way to apply it to your code, or you've found a bug or would like a feature it doesn't have, we want to hear from you!
+If you've run into behavior in MatchZoo-py you don't understand, or you're having trouble working out a good way to apply it to your code, or you've found a bug or would like a feature it doesn't have, we want to hear from you!

 Our main forum for discussion is the project's [GitHub issue tracker](https://github.com/NTMC-Community/MatchZoo-py/issues). This is the right place to start a discussion of any of the above or most any other topic concerning the project.

-For less formal discussion we have a chat room on WeChat (mostly Chinese speakers). MatchZoo core developers are almost always present; feel free to find us there and we're happy to chat. Please add *YQ-Cai1198593462* as your WeChat friend, she will invite you to join the chat room.
+For less formal discussion we have a chat room on WeChat (mostly Chinese speakers). MatchZoo-py core developers are almost always present; feel free to find us there and we're happy to chat. Please add *YQ-Cai1198593462* as your WeChat friend, she will invite you to join the chat room.

 First Time Contributors
 -----------------------

-MatchZoo appreciates your contribution! If you are interested in helping improve MatchZoo, there are several ways to get started:
+MatchZoo-py appreciates your contribution! If you are interested in helping improve MatchZoo-py, there are several ways to get started:

 * Work on [new models](https://github.com/NTMC-Community/awaresome-neural-models-for-semantic-match).
 * Work on [tutorials](https://github.com/NTMC-Community/MatchZoo-py/tree/master/tutorials).
@@ -27,19 +27,19 @@ MatchZoo appreciates your contribution! If you are interested in helping improve
 Submitting Changes
 ------------------

-Even more excellent than a good bug report is a fix for a bug, or the implementation of a much-needed new model.
+Even more excellent than a good bug report is a fix for a bug, or the implementation of a much-needed new model.

 (*) We'd love to have your contributions.

 (*) If your new feature will be a lot of work, we recommend talking to us early -- see below.

-We use the usual GitHub pull-request flow, which may be familiar to you if you've contributed to other projects on GitHub -- see below.
+We use the usual GitHub pull-request flow, which may be familiar to you if you've contributed to other projects on GitHub -- see below.

-Anyone interested in MatchZoo may review your code. One of the MatchZoo core developers will merge your pull request when they think it's ready.
+Anyone interested in MatchZoo-py may review your code. One of the MatchZoo-py core developers will merge your pull request when they think it's ready.
 For every pull request, we aim to promptly either merge it or say why it's not yet ready; if you go a few days without a reply, please feel
 free to ping the thread by adding a new comment.

-For a list of MatchZoo core developers, see [README](https://github.com/NTMC-Community/MatchZoo-py/blob/master/README.md).
+For a list of MatchZoo-py core developers, see [README](https://github.com/NTMC-Community/MatchZoo-py/blob/master/README.md).

 Contributing Flow
 ------------------
@@ -55,12 +55,12 @@ Contributing Flow


 Your PR will be merged if:
-- Funcitonally benefit for the project.
-- Passed Countinuous Integration (all unit tests, integration tests and [PEP8](https://www.python.org/dev/peps/pep-0008/) check passed).
-- Test coverage didn't decreased, we use [pytest](https://docs.pytest.org/en/latest/).
-- With proper docstrings, see codebase as examples.
-- With type hints, see [typing](https://docs.python.org/3/library/typing.html).
-- All reviewers approved your changes.
+- Funcitonally benefit for the project.
+- Passed Countinuous Integration (all unit tests, integration tests and [PEP8](https://www.python.org/dev/peps/pep-0008/) check passed).
+- Test coverage didn't decreased, we use [pytest](https://docs.pytest.org/en/latest/).
+- With proper docstrings, see codebase as examples.
+- With type hints, see [typing](https://docs.python.org/3/library/typing.html).
+- All reviewers approved your changes.


-**Thanks and let's improve MatchZoo together!**
+**Thanks and let's improve MatchZoo-py together!**
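
Review note: the merge criteria in the CONTRIBUTING.md hunk above ask for docstrings, type hints, and pytest coverage. As a rough sketch of what a contribution meeting those three points might look like, the example below defines a small ranking helper together with its test. The function `precision_at_k` is hypothetical and written only for illustration; it is not part of MatchZoo-py, and the docstring layout is an assumption, so check the codebase for the project's actual conventions.

```python
from typing import Sequence


def precision_at_k(
    labels: Sequence[int],
    scores: Sequence[float],
    k: int = 1,
) -> float:
    """
    Compute precision at rank `k` for a single query.

    :param labels: Relevance labels; a value above 0 means relevant.
    :param scores: Predicted scores, aligned index by index with `labels`.
    :param k: Number of top-ranked items to inspect.
    :return: Fraction of the top `k` items that are relevant.
    """
    if k <= 0:
        raise ValueError(f"k must be positive, got {k}.")
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length.")
    # Rank items by descending score, then count relevant ones in the top k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(1 for _, label in ranked[:k] if label > 0)
    return hits / k


def test_precision_at_k() -> None:
    """pytest collects functions named `test_*`; plain asserts are enough."""
    assert precision_at_k([0, 1, 0], [0.2, 0.9, 0.4], k=1) == 1.0
    assert precision_at_k([0, 1, 0], [0.2, 0.9, 0.4], k=2) == 0.5
```

Saving this in a file named like `test_*.py` and running `pytest` on it executes the test; keeping a test next to each new function is one way to avoid lowering coverage.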

MANIFEST.in

+1-1
@@ -1 +1 @@
-recursive-include matchzoo/datasets/toy *
+recursive-include matchzoo/datasets/toy *

README.md

+24-5
@@ -85,7 +85,7 @@ valid_pack = mz.datasets.wiki_qa.load_data('dev', task=ranking_task)
 Preprocess your input data in three lines of code, keep track parameters to be passed into the model:

 ```python
-preprocessor = mz.models.DSSM.get_default_preprocessor()
+preprocessor = mz.models.ArcI.get_default_preprocessor()
 train_processed = preprocessor.fit_transform(train_pack)
 valid_processed = preprocessor.transform(valid_pack)
 ```
@@ -106,7 +106,7 @@ validset = mz.dataloader.Dataset(

 Define padding callback and generate data loader:
 ```python
-padding_callback = mz.models.DSSM.get_default_padding_callback()
+padding_callback = mz.models.ArcI.get_default_padding_callback()

 trainloader = mz.dataloader.DataLoader(
     dataset=trainset,
@@ -125,7 +125,7 @@ validloader = mz.dataloader.DataLoader(
 Initialize the model, fine-tune the hyper-parameters:

 ```python
-model = mz.models.DSSM()
+model = mz.models.ArcI()
 model.params['task'] = ranking_task
 model.params['vocab_size'] = preprocessor.context['vocab_size']
 model.guess_and_fill_missing_params()
@@ -188,8 +188,13 @@ python setup.py install
 - [ConvKNRM](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/conv_knrm.py): this model is an implementation of <a href="http://www.cs.cmu.edu/~zhuyund/papers/WSDM_2018_Dai.pdf">Convolutional neural networks for soft-matching n-grams in ad-hoc search</a>
 - [ESIM](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/esim.py): this model is an implementation of <a href="https://arxiv.org/abs/1609.06038">Enhanced LSTM for Natural Language Inference</a>
 - [BiMPM](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/bimpm.py): this model is an implementation of <a href="https://arxiv.org/abs/1702.03814">Bilateral Multi-Perspective Matching for Natural Language Sentences</a>
-
-- Models under development: <a href="https://arxiv.org/abs/1602.06359">MatchPyramid</a>, <a href="https://arxiv.org/abs/1604.04378">Match-SRNN</a>, <a href="https://arxiv.org/abs/1710.05649">DeepRank</a>, <a href="https://arxiv.org/abs/1801.01641">aNMM</a> ....
+- [MatchPyramid](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/match_pyramid.py): this model is an implementation of <a href="https://arxiv.org/abs/1602.06359">Text Matching as Image Recognition</a>
+- [Match-SRNN](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/match_srnn.py): this model is an implementation of <a href="https://arxiv.org/abs/1604.04378">Match-SRNN: Modeling the Recursive Matching Structure with Spatial RNN</a>
+- [aNMM](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/anmm.py): this model is an implementation of <a href="https://arxiv.org/abs/1801.01641">aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model</a>
+- [MV-LSTM](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/mvlstm.py): this model is an implementation of <a href="https://arxiv.org/pdf/1511.08277.pdf">A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations</a>
+- [DIIN](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/diin.py): this model is an implementation of <a href="https://arxiv.org/pdf/1709.04348.pdf">Natural Lanuguage Inference Over Interaction Space</a>
+- [HBMP](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/hbmp.py): this model is an implementation of <a href="https://arxiv.org/pdf/1808.08762.pdf">Sentence Embeddings in NLI with Iterative Refinement Encoders</a>
+- [BERT](https://github.com/NTMC-Community/MatchZoo-py/tree/master/matchzoo/models/bert.py): this model is an implementation of <a href="https://arxiv.org/abs/1810.04805">BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</a>


 ## Citation
@@ -251,6 +256,20 @@ If you use MatchZoo in your research, please use the following BibTex entry.
 <p>Dev<br>
 PhD. ICT</p>
 </td>
+</tr>
+<tr align="center">
+<td>
+  <a href="https://github.com/ChrisRBXiong"><img width="40" height="40" src="https://github.com/ChrisRBXiong.png?s=40" alt="ChrisRBXiong"></a><br>
+  <a href="https://github.com/ChrisRBXiong">Ruibin Xiong</a>
+<p>Dev<br>
+M.S. ICT</p>
+</td>
+<td>
+  <a href="https://github.com/dyuyang"><img width="40" height="40" src="https://github.com/dyuyang.png?s=40" alt="dyuyang"></a><br>
+  <a href="https://github.com/dyuyang">Yuyang Ding</a>
+<p>Dev<br>
+M.S. ICT</p>
+</td>
 <td>
   <a href="https://github.com/rgtjf"><img width="40" height="40" src="https://github.com/rgtjf.png?s=36" alt="rgtjf"></a><br>
   <a href="https://github.com/rgtjf">Junfeng Tian</a>
