@@ -83,35 +83,259 @@ pip install sparsezoo

## Quick Tour

- ### Python APIs
+ The SparseZoo Python API enables you to search and download sparsified models. Code examples are given below.
+ We encourage users to load SparseZoo models by copying a stub directly from a [model page](https://sparsezoo.neuralmagic.com/).

- The Python APIs respect this format enabling you to search and download models. Some code examples are given below.
- The [SparseZoo UI](https://sparsezoo.neuralmagic.com/) also enables users to load models by copying
- a stub directly from a model page.
+ ### Introduction to Model Class Object

+ `Model` is the fundamental object that serves as the main interface to the SparseZoo library.
+ It represents a SparseZoo model, together with all its directories and files.

- #### Loading from a Stub
+ #### Creating a Model Class Object From SparseZoo Stub
+ ```python
+ from sparsezoo import Model
+
+ stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
+
+ model = Model(stub)
+ print(str(model))
+
+ >> Model(stub=zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none)
+ ```
+
+ #### Creating a Model Class Object From Local Model Directory
+ ```python
+ from sparsezoo import Model
+
+ directory = ".../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0"
+
+ model = Model(directory)
+ print(str(model))
+
+ >> Model(directory=.../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0)
+ ```
+
+ #### Manually Specifying the Model Download Path
+
+ Unless specified otherwise, the model created from the SparseZoo stub is saved to the local sparsezoo cache directory.
+ This can be overridden by passing the optional `download_path` argument to the constructor:
+
+ ```python
+ from sparsezoo import Model
+
+ stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
+ download_directory = "./model_download_directory"
+
+ model = Model(stub, download_path=download_directory)
+ ```
+ #### Downloading the Model Files
+ Once the model is initialized from a stub, it can be downloaded either by calling the `download()` method or by accessing the `path` property. Both approaches work for every file in SparseZoo. Accessing the `path` property triggers a download unless the file has already been downloaded.
+
+ ```python
+ # method 1
+ model.download()
+
+ # method 2
+ model_path = model.path
+ ```
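
The download-on-access behaviour described above can be pictured with a small, self-contained sketch. This is not SparseZoo's implementation: `LazyFile`, `fake_fetch`, and the temporary cache directory are hypothetical names invented for illustration, and the "download" is simulated locally.

```python
import tempfile
from pathlib import Path


class LazyFile:
    """Hypothetical stand-in for a remote file that is fetched on first access."""

    def __init__(self, name, cache_dir, fetch):
        self.name = name
        self._target = Path(cache_dir) / name
        self._fetch = fetch  # callable that returns the file's bytes

    def download(self):
        # Skip the fetch when a local copy already exists.
        if not self._target.exists():
            self._target.parent.mkdir(parents=True, exist_ok=True)
            self._target.write_bytes(self._fetch())

    @property
    def path(self):
        # Accessing `path` makes sure the file is present locally first.
        self.download()
        return str(self._target)


calls = []


def fake_fetch():
    # Simulated download; records each call so repeated fetches are visible.
    calls.append(1)
    return b"model bytes"


cache_dir = tempfile.mkdtemp()
lazy_file = LazyFile("model.onnx", cache_dir, fake_fetch)
first = lazy_file.path   # triggers the simulated download
second = lazy_file.path  # reuses the local copy
print(len(calls))
```

Routing both `download()` and `path` through the same existence check is what keeps the two access routes interchangeable in this sketch.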
+
+ #### Inspecting the Contents of the SparseZoo Model
+
+ We use the `available_files` attribute to inspect which files are present in the SparseZoo model. Then, we select a file by accessing the corresponding attribute:
+
+ ```python
+ model.available_files
+
+ >> {'training': Directory(name=training),
+ >> 'deployment': Directory(name=deployment),
+ >> 'sample_inputs': Directory(name=sample_inputs.tar.gz),
+ >> 'sample_outputs': {'framework': Directory(name=sample_outputs.tar.gz)},
+ >> 'sample_labels': Directory(name=sample_labels.tar.gz),
+ >> 'model_card': File(name=model.md),
+ >> 'recipes': Directory(name=recipe),
+ >> 'onnx_model': File(name=model.onnx)}
+ ```
+ Then, we might take a closer look at the contents of the SparseZoo model:
+ ```python
+ model_card = model.model_card
+ print(model_card)
+
+ >> File(name=model.md)
+ ```
+ ```python
+ model_card_path = model.model_card.path
+ print(model_card_path)
+
+ >> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.md
+ ```
+
+
+ ### Model, Directory, and File
+
+ In general, every file in the SparseZoo model shares a set of attributes: `name`, `path`, `url`, and `parent_directory`:
+ - `name` serves as an identifier of the file/directory
+ - `path` points to the location of the file/directory
+ - `url` specifies the server address of the file/directory in question
+ - `parent_directory` points to the location of the parent directory of the file/directory in question
+
+ A directory is a unique type of file that contains other files. For that reason, it has an additional `files` attribute.
+
+ ```python
+ print(model.onnx_model)
+
+ >> File(name=model.onnx)
+
+ print(f"File name: {model.onnx_model.name}\n"
+       f"File path: {model.onnx_model.path}\n"
+       f"File URL: {model.onnx_model.url}\n"
+       f"Parent directory: {model.onnx_model.parent_directory}")
+
+ >> File name: model.onnx
+ >> File path: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.onnx
+ >> File URL: https://models.neuralmagic.com/cv-classification/...
+ >> Parent directory: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0
+ ```
+
+ ```python
+ print(model.recipes)
+
+ >> Directory(name=recipe)
+
+ print(f"File name: {model.recipes.name}\n"
+       f"Contains: {[file.name for file in model.recipes.files]}\n"
+       f"File path: {model.recipes.path}\n"
+       f"File URL: {model.recipes.url}\n"
+       f"Parent directory: {model.recipes.parent_directory}")
+
+ >> File name: recipe
+ >> Contains: ['recipe_original.md', 'recipe_transfer-classification.md']
+ >> File path: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe
+ >> File URL: None
+ >> Parent directory: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0
+ ```
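
To make the file/directory relationship concrete, here is a minimal sketch that mirrors the attribute names used in the examples above (`name`, `path`, `url`, `parent_directory`, `files`). The dataclasses and the `walk` helper are hypothetical and written only for illustration; SparseZoo's own `File` and `Directory` classes may be structured differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class File:
    # Attribute names follow the README examples; defaults are assumptions.
    name: str
    path: Optional[str] = None
    url: Optional[str] = None
    parent_directory: Optional[str] = None


@dataclass
class Directory(File):
    # A directory is a file that additionally holds other files.
    files: List[File] = field(default_factory=list)


def walk(entry: File, depth: int = 0) -> List[str]:
    """Return an indented listing of an entry and everything below it."""
    lines = ["  " * depth + entry.name]
    if isinstance(entry, Directory):
        for child in entry.files:
            lines.extend(walk(child, depth + 1))
    return lines


recipes = Directory(
    name="recipe",
    files=[
        File(name="recipe_original.md"),
        File(name="recipe_transfer-classification.md"),
    ],
)
listing = walk(recipes)
print("\n".join(listing))
```

Modelling a directory as a subclass of a file means code that only needs `name` or `path` can treat both uniformly, which matches how the examples above print the same attributes for `model.onnx_model` and `model.recipes`.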
+
+ ### Selecting Checkpoint-Specific Data
+
+ A SparseZoo model may contain several checkpoints. One checkpoint may have been saved before the model was quantized; that checkpoint would be used for transfer learning. Another checkpoint might have been saved after the quantization step; that one is usually used directly for inference.
+
+ The recipes may also vary depending on the use case. We may want to access the recipe that was used to sparsify the dense model (`recipe_original`) or the one that enables us to sparse transfer learn from the already sparsified model (`recipe_transfer`).
+
+ There are two ways to access those specific files.
+ #### Accessing Recipes (Through Python API)
+ ```python
+ available_recipes = model.recipes.available
+ print(available_recipes)
+
+ >> ['original', 'transfer-classification']
+
+ transfer_recipe = model.recipes["transfer-classification"]
+ print(transfer_recipe)
+
+ >> File(name=recipe_transfer-classification.md)
+
+ original_recipe = model.recipes.default  # recipe defaults to `original`
+ original_recipe_path = original_recipe.path  # downloads the recipe and returns its path
+ print(original_recipe_path)
+
+ >> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe/recipe_original.md
+ ```
+
+ #### Accessing Checkpoints (Through Python API)
+ In general, we expect the following checkpoints to be included in the model:
+
+ - `checkpoint_prepruning`
+ - `checkpoint_postpruning`
+ - `checkpoint_preqat`
+ - `checkpoint_postqat`
+
+ The checkpoint that the model defaults to is the `preqat` state (just before the quantization step).

```python
from sparsezoo import Model

- # copied from https://sparsezoo.neuralmagic.com/
- stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned90_quant-none"
+ stub = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant_3layers-aggressive_84"
+
model = Model(stub)
- print(model)
+ available_checkpoints = model.training.available
+ print(available_checkpoints)
+
+ >> ['preqat']
+
+ preqat_checkpoint = model.training.default  # checkpoint defaults to `preqat`
+ preqat_checkpoint_path = preqat_checkpoint.path  # downloads the checkpoint and returns its path
+ print(preqat_checkpoint_path)
+
+ >> .../.cache/sparsezoo/0857c6f2-13c1-43c9-8db8-8f89a548dccd/training
+
+ [print(file.name) for file in preqat_checkpoint.files]
+
+ >> vocab.txt
+ >> special_tokens_map.json
+ >> pytorch_model.bin
+ >> config.json
+ >> training_args.bin
+ >> tokenizer_config.json
+ >> trainer_state.json
+ >> tokenizer.json
```

- #### Searching the Zoo
+
+ #### Accessing Recipes (Through Stub String Arguments)
+
+ You can also directly request a specific recipe/checkpoint type by appending the appropriate URL query arguments to the stub:
+ ```python
+ from sparsezoo import Model
+
+ stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none?recipe=transfer"
+
+ model = Model(stub)
+
+ # Inspect which files are present.
+ # Note that the available recipes are restricted
+ # according to the specified URL query arguments
+ print(model.recipes.available)
+
+ >> ['transfer-classification']
+
+ transfer_recipe = model.recipes.default  # Now the recipes default to the one selected by the stub string arguments
+ print(transfer_recipe)
+
+ >> File(name=recipe_transfer-classification.md)
+ ```
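
As a rough illustration of how query arguments can be separated from the rest of a stub, the sketch below uses only the standard library. `split_stub` is a hypothetical helper, not a SparseZoo function, and the library may parse stubs differently internally.

```python
from urllib.parse import parse_qs


def split_stub(stub):
    """Split a stub into its base part and a dict of query arguments."""
    base, _, query = stub.partition("?")
    # parse_qs returns a list per key; this sketch keeps only the first value.
    args = {key: values[0] for key, values in parse_qs(query).items()}
    return base, args


base, args = split_stub(
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/"
    "pruned95_quant-none?recipe=transfer"
)
print(base)
print(args)
```

A stub without a `?` suffix yields an empty dictionary, so the same helper covers both plain stubs and stubs with arguments.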
+
+ ### Accessing Sample Data
+
+ The user may easily request a sample batch of data that represents the inputs and outputs of the model.
+
+ ```python
+ sample_data = model.sample_batch(batch_size=10)
+
+ print(sample_data['sample_inputs'][0].shape)
+ >> (10, 3, 224, 224)  # (batch_size, num_channels, image_dim, image_dim)
+
+ print(sample_data['sample_outputs'][0].shape)
+ >> (10, 1000)  # (batch_size, num_classes)
+ ```
+
+ ### Model Search
+ The function `search_models` enables the user to quickly filter the contents of the SparseZoo repository to find the stubs of interest:

```python
from sparsezoo import search_models

- models = search_models(
-     domain="cv",
-     sub_domain="classification",
-     return_stubs=True,
- )
- print(models)
+ args = {
+     "domain": "cv",
+     "sub_domain": "segmentation",
+     "architecture": "yolact",
+ }
+
+ models = search_models(**args)
+ [print(model) for model in models]
+
+ >> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none)
+ >> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none)
+ >> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none)
```
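
The kind of filtering that `search_models` performs can be approximated offline on a plain list of stub strings. The sketch below is an assumption-laden illustration, not the library's search logic: `filter_stubs` is a hypothetical helper, and it assumes the first three path fields of a stub are the domain, the sub-domain, and the architecture (optionally followed by `-<variant>`), which is inferred from the stubs shown above.

```python
from typing import List, Optional


def filter_stubs(
    stubs: List[str],
    domain: Optional[str] = None,
    sub_domain: Optional[str] = None,
    architecture: Optional[str] = None,
) -> List[str]:
    """Keep stubs whose leading path fields match every criterion given."""
    selected = []
    for stub in stubs:
        # Assumed layout: zoo:<domain>/<sub_domain>/<architecture>[-<variant>]/...
        fields = stub.split(":", 1)[1].split("/")
        stub_architecture = fields[2].split("-", 1)[0]
        if domain is not None and fields[0] != domain:
            continue
        if sub_domain is not None and fields[1] != sub_domain:
            continue
        if architecture is not None and stub_architecture != architecture:
            continue
        selected.append(stub)
    return selected


stubs = [
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none",
    "zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none",
    "zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none",
    "zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none",
]
matching_stubs = filter_stubs(
    stubs, domain="cv", sub_domain="segmentation", architecture="yolact"
)
print(len(matching_stubs))
```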

### Environmental Variables