Could not initialize NNPACK! Reason: Unsupported hardware #98

Open
cdcseacave opened this issue Jan 6, 2025 · 11 comments

Comments

@cdcseacave

Not sure if this affects the reconstruction but I see this in the logs:

[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware
@jytime
Contributor

jytime commented Jan 15, 2025

Hi, I think this will not affect the result as long as your run completes successfully.

@cdcseacave
Author

VGGSfM works for small scenes (the largest I managed to reconstruct had 125 images), but it fails every time I try the sequential method: it reconstructs the first batch of images fine, but always fails to reconstruct the second. I tried various image sizes for the first and subsequent batches, always with the same behavior. The message above is the closest thing to an error that I see printed.

@jytime
Contributor

jytime commented Jan 16, 2025

Usually this happens because the runner cannot find enough correspondences between the first batch and the second batch. Can you share the full log here?

@cdcseacave
Author

Poselib is available
Poselib is available
/home/ubuntu/vggsfm/video_demo.py:18: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
  @hydra.main(config_path="cfgs/", config_name="video_demo")
/home/ubuntu/miniconda3/envs/vggsfm_tmp/lib/python3.10/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Model Config: query_by_midpoint: true
shared_camera: true
camera_type: SIMPLE_RADIAL
max_query_pts: 1024
init_window_size: 128
window_size: 64
joint_BA_interval: 6
grid_based_sample: false
gr_visualize: false
fmat_thres: 4.0
max_reproj_error: 4.0
init_max_reproj_error: 4.0
dense_depth: false
avg_pose: true
save_to_disk: true
SCENE_DIR: data/5F3D017A-8319-413E-B82E-9EC7A99246A3/
resume_ckpt: ckpt/vggsfm_v2_0_0.bin
auto_download_ckpt: true
query_method: aliked
use_poselib: true
shift_point2d_to_original_res: false
make_reproj_video: false
visual_tracks: false
visual_query_points: false
visual_dense_point_cloud: false
query_by_interval: false
concat_extra_points: false
extra_pt_pixel_interval: -1
extra_by_neighbor: -1
MODEL:
  _target_: vggsfm.models.VGGSfM
  TRACK:
    _target_: vggsfm.models.TrackerPredictor
    efficient_corr: false
    COARSE:
      stride: 4
      down_ratio: 2
      FEATURENET:
        _target_: vggsfm.models.BasicEncoder
      PREDICTOR:
        _target_: vggsfm.models.BaseTrackerPredictor
    FINE:
      FEATURENET:
        _target_: vggsfm.models.ShallowEncoder
      PREDICTOR:
        _target_: vggsfm.models.BaseTrackerPredictor
        depth: 4
        corr_levels: 3
        corr_radius: 3
        latent_dim: 32
        hidden_size: 256
        fine: true
        use_spaceatt: false
  CAMERA:
    _target_: vggsfm.models.CameraPredictor
  TRIANGULAE:
    _target_: vggsfm.models.Triangulator

Building VGGSfM
Using cache found in /home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/swiglu_ffn.py:51: UserWarning: xFormers is not available (SwiGLU)
  warnings.warn("xFormers is not available (SwiGLU)")
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/attention.py:33: UserWarning: xFormers is not available (Attention)
  warnings.warn("xFormers is not available (Attention)")
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:40: UserWarning: xFormers is not available (Block)
  warnings.warn("xFormers is not available (Block)")
[2025-01-16 20:22:57,547][dinov2][INFO] - using MLP layer as FFN
vggsfm_v2_0_0.bin: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.13G/1.13G [01:48<00:00, 10.4MB/s]
VGGSfM built successfully
Data size of Sequence: 1
Run Sparse Reconstruction for Scene 5F3D017A-8319-413E-B82E-9EC7A99246A3
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
Predicting tracks with query_index = 63
Predict tracks in chunks to fit in memory
Predicting tracks with query_index = 0
Predict tracks in chunks to fit in memory
Predicting tracks with query_index = 127
Predict tracks in chunks to fit in memory
Processing non visible frames:  [20, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 20
Predict tracks in chunks to fit in memory
Processing non visible frames:  [26, 27, 28, 29, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 26
Predict tracks in chunks to fit in memory
Processing non visible frames:  [33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 33
Predict tracks in chunks to fit in memory
Processing non visible frames:  [40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 40
Predict tracks in chunks to fit in memory
Processing non visible frames:  [48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 67, 68, 69, 70, 71, 72, 73, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 48
Predict tracks in chunks to fit in memory
Processing non visible frames:  [50, 51, 52, 53, 54, 55, 56, 57, 67, 68, 69, 70, 83, 84, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 50
Predict tracks in chunks to fit in memory
Processing non visible frames:  [53, 54, 55, 56, 57, 83, 84, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 53
Predict tracks in chunks to fit in memory
Processing non visible frames:  [83, 84, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 83
Predict tracks in chunks to fit in memory
Processing non visible frames:  [104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 104
Predict tracks in chunks to fit in memory
Processing non visible frames:  [118, 119, 120, 121, 122, 123, 124]
Predicting tracks with query_index = 118
Predict tracks in chunks to fit in memory
Processing non visible frames:  [122, 123, 124]
Predicting tracks with query_index = 122
Predict tracks in chunks to fit in memory
I20250116 20:34:31.258636 135298773215040 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250116 20:34:31.778979 135298773215040 timer.cc:91] Elapsed time: 0.009 [minutes]
Finished init BA
This frame only has inliers: 27
This frame only has inliers: 14
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 3
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
Finished init refine pose
Triangulate tracks in chunks to fit in memory
I20250116 20:34:54.546104 135298773215040 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250116 20:34:58.138814 135298773215040 timer.cc:91] Elapsed time: 0.060 [minutes]
Finished track triangulation and BA
Frame 26 only has 0 geo_vis inliers
Frame 27 only has 0 geo_vis inliers
Frame 28 only has 0 geo_vis inliers
Frame 29 only has 0 geo_vis inliers
Frame 30 only has 0 geo_vis inliers
Frame 31 only has 0 geo_vis inliers
Frame 32 only has 0 geo_vis inliers
Frame 33 only has 0 geo_vis inliers
Frame 34 only has 0 geo_vis inliers
Frame 35 only has 0 geo_vis inliers
Frame 36 only has 0 geo_vis inliers
Frame 37 only has 1 geo_vis inliers
Frame 38 only has 69 geo_vis inliers
Frame 39 only has 67 geo_vis inliers
Frame 40 only has 0 geo_vis inliers
Frame 41 only has 26 geo_vis inliers
Frame 42 only has 37 geo_vis inliers
Frame 43 only has 0 geo_vis inliers
Frame 44 only has 44 geo_vis inliers
Frame 45 only has 66 geo_vis inliers
Frame 46 only has 3 geo_vis inliers
Frame 47 only has 0 geo_vis inliers
Frame 48 only has 0 geo_vis inliers
Frame 49 only has 0 geo_vis inliers
Frame 50 only has 0 geo_vis inliers
Frame 51 only has 0 geo_vis inliers
Frame 52 only has 0 geo_vis inliers
Frame 53 only has 0 geo_vis inliers
Frame 54 only has 0 geo_vis inliers
Frame 55 only has 0 geo_vis inliers
Frame 56 only has 0 geo_vis inliers
Frame 57 only has 0 geo_vis inliers
Frame 58 only has 0 geo_vis inliers
Frame 59 only has 0 geo_vis inliers
Frame 60 only has 58 geo_vis inliers
Frame 61 only has 80 geo_vis inliers
Frame 62 only has 73 geo_vis inliers
Frame 63 only has 41 geo_vis inliers
Frame 64 only has 22 geo_vis inliers
Frame 65 only has 39 geo_vis inliers
Frame 66 only has 65 geo_vis inliers
Frame 67 only has 2 geo_vis inliers
Frame 68 only has 0 geo_vis inliers
Frame 69 only has 0 geo_vis inliers
Frame 70 only has 0 geo_vis inliers
Frame 71 only has 0 geo_vis inliers
Frame 72 only has 52 geo_vis inliers
Frame 73 only has 0 geo_vis inliers
Frame 74 only has 0 geo_vis inliers
Frame 75 only has 14 geo_vis inliers
Frame 76 only has 3 geo_vis inliers
Frame 77 only has 36 geo_vis inliers
Frame 78 only has 58 geo_vis inliers
Frame 79 only has 32 geo_vis inliers
Frame 80 only has 8 geo_vis inliers
Frame 81 only has 0 geo_vis inliers
Frame 82 only has 31 geo_vis inliers
Frame 83 only has 0 geo_vis inliers
Frame 84 only has 11 geo_vis inliers
Frame 85 only has 8 geo_vis inliers
Frame 86 only has 0 geo_vis inliers
Frame 87 only has 58 geo_vis inliers
Frame 88 only has 58 geo_vis inliers
Frame 89 only has 0 geo_vis inliers
Frame 90 only has 0 geo_vis inliers
Frame 91 only has 0 geo_vis inliers
Frame 92 only has 0 geo_vis inliers
Frame 93 only has 0 geo_vis inliers
Frame 94 only has 24 geo_vis inliers
Frame 95 only has 58 geo_vis inliers
Frame 96 only has 0 geo_vis inliers
Frame 97 only has 0 geo_vis inliers
Frame 98 only has 0 geo_vis inliers
Frame 99 only has 0 geo_vis inliers
Frame 100 only has 55 geo_vis inliers
Frame 109 only has 81 geo_vis inliers
Frame 112 only has 85 geo_vis inliers
Frame 115 only has 0 geo_vis inliers
Frame 116 only has 63 geo_vis inliers
Frame 117 only has 0 geo_vis inliers
Frame 118 only has 0 geo_vis inliers
Frame 119 only has 0 geo_vis inliers
Frame 120 only has 0 geo_vis inliers
Frame 121 only has 0 geo_vis inliers
Frame 122 only has 0 geo_vis inliers
Frame 123 only has 0 geo_vis inliers
Frame 124 only has 0 geo_vis inliers
Frame 125 only has 7 geo_vis inliers
Frame 126 only has 7 geo_vis inliers
Frame 127 only has 7 geo_vis inliers
some frames are invalid after BA refinement
Triangulate tracks in chunks to fit in memory
I20250116 20:35:25.053192 135298773215040 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250116 20:35:30.524742 135298773215040 timer.cc:91] Elapsed time: 0.091 [minutes]
Filter 3D points in chunks to fit in memory
Finished robust refine 0
Frame 26 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 26
Frame 27 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 27
Frame 28 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 28
Frame 29 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 29
Frame 30 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 30
Frame 31 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 31
Frame 32 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 32
Frame 33 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 33
Frame 34 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 34
Frame 35 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 35
Frame 36 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 36
Frame 37 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 37
Estimating absolute poses by visible matches for frame 38
Estimating absolute poses by visible matches for frame 39
Frame 40 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 40
Estimating absolute poses by visible matches for frame 41
Estimating absolute poses by visible matches for frame 42
Frame 43 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 43
Estimating absolute poses by visible matches for frame 44
Estimating absolute poses by visible matches for frame 45
Estimating absolute poses by visible matches for frame 46
Frame 47 only has 65 geo_vis inliers
Estimating absolute poses by visible matches for frame 47
Frame 48 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 48
Frame 49 only has 14 geo_vis inliers
Estimating absolute poses by visible matches for frame 49
Frame 50 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 50
Frame 51 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 51
Frame 52 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 52
Frame 53 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 53
Frame 54 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 54
Frame 55 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 55
Frame 56 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 56
Frame 57 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 57
Frame 58 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 58
Frame 59 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 59
Estimating absolute poses by visible matches for frame 60
Estimating absolute poses by visible matches for frame 61
Estimating absolute poses by visible matches for frame 62
Estimating absolute poses by visible matches for frame 63
Estimating absolute poses by visible matches for frame 64
Estimating absolute poses by visible matches for frame 65
Estimating absolute poses by visible matches for frame 66
Frame 67 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 67
Frame 68 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 68
Frame 69 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 69
Frame 70 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 70
Frame 71 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 71
Estimating absolute poses by visible matches for frame 72
Estimating absolute poses by visible matches for frame 73
Estimating absolute poses by visible matches for frame 74
Estimating absolute poses by visible matches for frame 75
Estimating absolute poses by visible matches for frame 76
Estimating absolute poses by visible matches for frame 77
Estimating absolute poses by visible matches for frame 78
Estimating absolute poses by visible matches for frame 79
Estimating absolute poses by visible matches for frame 80
Estimating absolute poses by visible matches for frame 81
Estimating absolute poses by visible matches for frame 82
Frame 83 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 83
Frame 84 only has 62 geo_vis inliers
Estimating absolute poses by visible matches for frame 84
Estimating absolute poses by visible matches for frame 85
Estimating absolute poses by visible matches for frame 86
Estimating absolute poses by visible matches for frame 87
Estimating absolute poses by visible matches for frame 88
Frame 89 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 89
Frame 90 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 90
Frame 91 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 91
Frame 92 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 92
Estimating absolute poses by visible matches for frame 93
Estimating absolute poses by visible matches for frame 94
Estimating absolute poses by visible matches for frame 95
Frame 96 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 96
Frame 97 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 97
Frame 98 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 98
Frame 99 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 99
Estimating absolute poses by visible matches for frame 100
Estimating absolute poses by visible matches for frame 101
Estimating absolute poses by visible matches for frame 102
Estimating absolute poses by visible matches for frame 103
Estimating absolute poses by visible matches for frame 104
Estimating absolute poses by visible matches for frame 105
Estimating absolute poses by visible matches for frame 106
Estimating absolute poses by visible matches for frame 107
Estimating absolute poses by visible matches for frame 108
Estimating absolute poses by visible matches for frame 109
Estimating absolute poses by visible matches for frame 110
Estimating absolute poses by visible matches for frame 111
Estimating absolute poses by visible matches for frame 112
Estimating absolute poses by visible matches for frame 113
Estimating absolute poses by visible matches for frame 114
Frame 115 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 115
Frame 116 only has 1 geo_vis inliers
Estimating absolute poses by visible matches for frame 116
Frame 117 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 117
Frame 118 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 118
Frame 119 only has 2 geo_vis inliers
Estimating absolute poses by visible matches for frame 119
Frame 120 only has 22 geo_vis inliers
Estimating absolute poses by visible matches for frame 120
Frame 121 only has 22 geo_vis inliers
Estimating absolute poses by visible matches for frame 121
Frame 122 only has 20 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 122
Frame 123 only has 20 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 123
Frame 124 only has 2 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 124
Frame 125 only has 18 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 125
Frame 126 only has 18 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 126
Frame 127 only has 18 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 127
some frames are invalid after BA refinement
Triangulate tracks in chunks to fit in memory
I20250116 20:36:56.472634 135298773215040 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250116 20:37:06.462456 135298773215040 timer.cc:91] Elapsed time: 0.166 [minutes]
Filter 3D points in chunks to fit in memory
Finished robust refine 1
Running iterative BA by 1 times
Triangulate tracks in chunks to fit in memory
Filter 3D points in chunks to fit in memory
I20250116 20:37:34.731953 135298773215040 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
W20250116 20:37:46.906341 135298773215040 levenberg_marquardt_strategy.cc:115] Linear solver failure. Failed to compute a step: CHOLMOD warning: Matrix not positive definite.
I20250116 20:37:50.687950 135298773215040 bundle_adjustment.cc:866]
Bundle adjustment report:
    Residuals : 265322
   Parameters : 47245
   Iterations : 101
         Time : 15.8541 [s]
 Initial cost : 0.996746 [px]
   Final cost : 0.778282 [px]
  Termination : No convergence

I20250116 20:37:50.688014 135298773215040 timer.cc:91] Elapsed time: 0.266 [minutes]
Filter 3D points in chunks to fit in memory
Finished iterative BA 0
Processing window from 128 to 192
Predicting tracks with query_index = 0
No valid frame, step back
Moving window failed, trying again. (This should not happen in most cases)
Processing window from 127 to 191
Error executing job with overrides: ['SCENE_DIR=data/5F3D017A-8319-413E-B82E-9EC7A99246A3/']
Traceback (most recent call last):
  File "/home/ubuntu/vggsfm/video_demo.py", line 74, in demo_fn
    predictions = vggsfm_runner.run(
  File "/home/ubuntu/vggsfm/vggsfm/runners/video_runner.py", line 173, in run
    start_idx, end_idx, move_success, _ = self.move_window(
  File "/home/ubuntu/vggsfm/vggsfm/runners/video_runner.py", line 662, in move_window
    pred_cameras = average_camera_prediction(
  File "/home/ubuntu/vggsfm/vggsfm/utils/utils.py", line 61, in average_camera_prediction
    new_order = calculate_index_mappings(
  File "/home/ubuntu/vggsfm/vggsfm/utils/utils.py", line 173, in calculate_index_mappings
    new_order[0] = query_index
IndexError: index 0 is out of bounds for dimension 0 with size 0

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
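
Editor's note: a log this long is easier to judge with a quick tally of the inlier messages. The sketch below is not part of VGGSfM; `summarize_inliers` is a hypothetical helper, and its two regular expressions simply assume the message formats that appear in the log above ("This frame only has inliers: N" and "Frame K only has N geo_vis inliers").

```python
import re
from collections import Counter

# Patterns copied from the log lines in this thread (assumed stable formats).
INIT_RE = re.compile(r"^This frame only has inliers: (\d+)$")
GEO_RE = re.compile(r"^Frame (\d+) only has (\d+) geo_vis inliers$")


def summarize_inliers(log_text):
    """Count reported frames with zero vs. non-zero inliers, per message type."""
    stats = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        m = INIT_RE.match(line)
        if m:
            # Message printed during the initial pose refinement.
            stats["init_zero" if int(m.group(1)) == 0 else "init_nonzero"] += 1
            continue
        m = GEO_RE.match(line)
        if m:
            # Message printed after track triangulation and BA.
            stats["geo_zero" if int(m.group(2)) == 0 else "geo_nonzero"] += 1
    return dict(stats)


# A few lines taken from the log above, used as a small sample.
sample = """\
Finished init BA
This frame only has inliers: 27
This frame only has inliers: 14
This frame only has inliers: 0
Frame 37 only has 1 geo_vis inliers
Frame 38 only has 69 geo_vis inliers
Frame 40 only has 0 geo_vis inliers
"""
print(summarize_inliers(sample))
```

Running it on the full saved log gives a rough count of how many frames in the initial window ended up with no inliers at all.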

@jytime
Contributor

jytime commented Jan 16, 2025

Hi,

It seems the root cause is that the model cannot initialize the reconstruction well: the first batch (init) appears to have very few inliers. I would suggest reducing the init window size from 128 to 32 or 64, and increasing max_query_pts to a higher number, such as 4096.
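
Editor's note: the logs in this thread show Hydra reporting `overrides: ['SCENE_DIR=...']`, so the suggested settings can presumably be passed the same way, as `key=value` arguments to `video_demo.py`. The sketch below only builds and prints such a command line. `build_command` and the scene path are hypothetical, and it is an assumption that `video_demo.py` accepts these keys as command-line overrides. The key names come from the config dump in the logs, and the values match the follow-up run.

```python
import shlex


def build_command(scene_dir, **overrides):
    """Build a Hydra-style command line: the script, then key=value overrides."""
    args = ["python", "video_demo.py", f"SCENE_DIR={scene_dir}"]
    args += [f"{key}={value}" for key, value in overrides.items()]
    return args


cmd = build_command(
    "data/my_scene/",        # hypothetical scene directory
    init_window_size=64,     # reduced from 128, as suggested
    window_size=32,          # value used in the follow-up run
    max_query_pts=4096,      # increased from 1024, as suggested
)
print(shlex.join(cmd))
```

Printing the command instead of executing it makes it easy to check the overrides before starting a long run.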

@cdcseacave
Author

I did try multiple combinations before, and none helped. Here are the results with the parameters you suggested:

Poselib is available
Poselib is available
/home/ubuntu/vggsfm/video_demo.py:18: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
  @hydra.main(config_path="cfgs/", config_name="video_demo")
/home/ubuntu/miniconda3/envs/vggsfm_tmp/lib/python3.10/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Model Config: query_by_midpoint: true
shared_camera: true
camera_type: SIMPLE_RADIAL
max_query_pts: 4096
init_window_size: 64
window_size: 32
joint_BA_interval: 6
grid_based_sample: false
model_name: vggsfm_v2_0_0
seed: 0
img_size: 1024
debug: false
center_order: false
mixed_precision: fp16
extract_color: true
filter_invalid_frame: true
comple_nonvis: true
query_frame_num: 3
robust_refine: 2
BA_iters: 1
fine_tracking: true
load_gt: false
viz_visualize: false
gr_visualize: false
fmat_thres: 4.0
max_reproj_error: 4.0
init_max_reproj_error: 4.0
dense_depth: false
avg_pose: true
save_to_disk: true
SCENE_DIR: data/5F3D017A-8319-413E-B82E-9EC7A99246A3/
resume_ckpt: ckpt/vggsfm_v2_0_0.bin
auto_download_ckpt: true
query_method: aliked
use_poselib: true
shift_point2d_to_original_res: false
make_reproj_video: false
visual_tracks: false
visual_query_points: false
visual_dense_point_cloud: false
query_by_interval: false
concat_extra_points: false
extra_pt_pixel_interval: -1
extra_by_neighbor: -1
MODEL:
  _target_: vggsfm.models.VGGSfM
  TRACK:
    _target_: vggsfm.models.TrackerPredictor
    efficient_corr: false
    COARSE:
      stride: 4
      down_ratio: 2
      FEATURENET:
        _target_: vggsfm.models.BasicEncoder
      PREDICTOR:
        _target_: vggsfm.models.BaseTrackerPredictor
    FINE:
      FEATURENET:
        _target_: vggsfm.models.ShallowEncoder
      PREDICTOR:
        _target_: vggsfm.models.BaseTrackerPredictor
        depth: 4
        corr_levels: 3
        corr_radius: 3
        latent_dim: 32
        hidden_size: 256
        fine: true
        use_spaceatt: false
  CAMERA:
    _target_: vggsfm.models.CameraPredictor
  TRIANGULAE:
    _target_: vggsfm.models.Triangulator

Building VGGSfM
Using cache found in /home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/swiglu_ffn.py:51: UserWarning: xFormers is not available (SwiGLU)
  warnings.warn("xFormers is not available (SwiGLU)")
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/attention.py:33: UserWarning: xFormers is not available (Attention)
  warnings.warn("xFormers is not available (Attention)")
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:40: UserWarning: xFormers is not available (Block)
  warnings.warn("xFormers is not available (Block)")
[2025-01-17 10:46:17,337][dinov2][INFO] - using MLP layer as FFN
VGGSfM built successfully
Data size of Sequence: 1
Run Sparse Reconstruction for Scene 5F3D017A-8319-413E-B82E-9EC7A99246A3
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
Predicting tracks with query_index = 31
Predict tracks in chunks to fit in memory
Predicting tracks with query_index = 0
Predict tracks in chunks to fit in memory
Predicting tracks with query_index = 63
Predict tracks in chunks to fit in memory
Processing non visible frames:  [38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 55, 56]
Predicting tracks with query_index = 38
Predict tracks in chunks to fit in memory
Processing non visible frames:  [50, 51, 52, 53, 55, 56]
Predicting tracks with query_index = 50
Predict tracks in chunks to fit in memory
Processing non visible frames:  [53, 55, 56]
Predicting tracks with query_index = 53
Predict tracks in chunks to fit in memory
I20250117 10:54:35.794021 140497061463872 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250117 10:54:37.237563 140497061463872 timer.cc:91] Elapsed time: 0.024 [minutes]
Finished init BA
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 7
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 5
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 31
some frames are invalid after BA refinement
Finished init refine pose
Triangulate tracks in chunks to fit in memory
I20250117 10:54:49.394839 140497061463872 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250117 10:54:51.051548 140497061463872 timer.cc:91] Elapsed time: 0.028 [minutes]
Finished track triangulation and BA
Frame 0 only has 0 geo_vis inliers
Frame 1 only has 0 geo_vis inliers
Frame 2 only has 4 geo_vis inliers
Frame 3 only has 0 geo_vis inliers
Frame 4 only has 0 geo_vis inliers
Frame 5 only has 4 geo_vis inliers
Frame 6 only has 19 geo_vis inliers
Frame 12 only has 13 geo_vis inliers
Frame 13 only has 35 geo_vis inliers
Frame 14 only has 12 geo_vis inliers
Frame 15 only has 0 geo_vis inliers
Frame 16 only has 0 geo_vis inliers
Frame 17 only has 0 geo_vis inliers
Frame 18 only has 0 geo_vis inliers
Frame 19 only has 0 geo_vis inliers
Frame 20 only has 0 geo_vis inliers
Frame 21 only has 0 geo_vis inliers
Frame 22 only has 4 geo_vis inliers
Frame 23 only has 0 geo_vis inliers
Frame 24 only has 0 geo_vis inliers
Frame 25 only has 0 geo_vis inliers
Frame 27 only has 0 geo_vis inliers
Frame 29 only has 0 geo_vis inliers
Frame 30 only has 0 geo_vis inliers
Frame 31 only has 0 geo_vis inliers
Frame 33 only has 0 geo_vis inliers
Frame 34 only has 0 geo_vis inliers
Frame 35 only has 21 geo_vis inliers
Frame 36 only has 21 geo_vis inliers
Frame 37 only has 0 geo_vis inliers
Frame 38 only has 21 geo_vis inliers
Frame 39 only has 0 geo_vis inliers
Frame 40 only has 0 geo_vis inliers
Frame 41 only has 0 geo_vis inliers
Frame 42 only has 0 geo_vis inliers
Frame 43 only has 0 geo_vis inliers
Frame 44 only has 0 geo_vis inliers
Frame 45 only has 0 geo_vis inliers
Frame 46 only has 0 geo_vis inliers
Frame 47 only has 0 geo_vis inliers
Frame 48 only has 0 geo_vis inliers
Frame 49 only has 0 geo_vis inliers
Frame 50 only has 0 geo_vis inliers
Frame 51 only has 0 geo_vis inliers
Frame 52 only has 0 geo_vis inliers
Frame 53 only has 0 geo_vis inliers
Frame 54 only has 0 geo_vis inliers
Frame 55 only has 0 geo_vis inliers
Frame 56 only has 0 geo_vis inliers
Frame 57 only has 0 geo_vis inliers
Frame 58 only has 0 geo_vis inliers
Frame 59 only has 32 geo_vis inliers
Frame 63 only has 80 geo_vis inliers
some frames are invalid after BA refinement
Triangulate tracks in chunks to fit in memory
I20250117 10:55:05.039976 140497061463872 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250117 10:55:09.236311 140497061463872 timer.cc:91] Elapsed time: 0.070 [minutes]
Finished robust refine 0
Frame 1 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 1
Frame 3 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 3
Frame 4 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 4
Frame 14 only has 38 geo_vis inliers
Estimating absolute poses by visible matches for frame 14
Frame 17 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 17
Frame 18 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 18
Frame 19 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 19
Frame 20 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 20
Frame 21 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 21
Frame 24 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 24
Frame 25 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 25
Frame 27 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 27
Frame 29 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 29
Frame 30 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 30
Frame 31 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 31
Frame 33 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 33
Frame 34 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 34
Frame 35 only has 36 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 35
Frame 36 only has 36 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 36
Frame 37 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 37
Frame 38 only has 36 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 38
Frame 39 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 39
Frame 40 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 40
Frame 41 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 41
Frame 42 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 42
Frame 43 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 43
Frame 44 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 44
Frame 45 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 45
Frame 46 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 46
Frame 47 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 47
Frame 48 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 48
Frame 49 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 49
Frame 50 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 50
Frame 51 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 51
Frame 52 only has 47 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 52
Frame 53 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 53
Frame 54 only has 47 geo_vis inliers
Estimating absolute poses by visible matches for frame 54
Frame 55 only has 47 geo_vis inliers
Estimating absolute poses by visible matches for frame 55
Frame 56 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 56
Frame 57 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 57
Frame 58 only has 25 geo_vis inliers
Estimating absolute poses by visible matches for frame 58
Estimating absolute poses by visible matches for frame 59
Estimating absolute poses by visible matches for frame 60
Estimating absolute poses by visible matches for frame 61
Estimating absolute poses by visible matches for frame 62
Estimating absolute poses by visible matches for frame 63
some frames are invalid after BA refinement
Triangulate tracks in chunks to fit in memory
I20250117 10:56:13.296793 140497061463872 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250117 10:56:21.453214 140497061463872 timer.cc:91] Elapsed time: 0.136 [minutes]
Filter 3D points in chunks to fit in memory
Finished robust refine 1
Running iterative BA by 1 times
Triangulate tracks in chunks to fit in memory
Filter 3D points in chunks to fit in memory
I20250117 10:56:36.221243 140497061463872 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250117 10:56:50.300026 140497061463872 bundle_adjustment.cc:866]
Bundle adjustment report:
    Residuals : 201970
   Parameters : 45538
   Iterations : 101
         Time : 13.9965 [s]
 Initial cost : 0.985694 [px]
   Final cost : 0.646358 [px]
  Termination : No convergence

I20250117 10:56:50.300079 140497061463872 timer.cc:91] Elapsed time: 0.235 [minutes]
Filter 3D points in chunks to fit in memory
Finished iterative BA 0
Processing window from 64 to 96
Predicting tracks with query_index = 0
Predict tracks in chunks to fit in memory
Error executing job with overrides: ['SCENE_DIR=data/5F3D017A-8319-413E-B82E-9EC7A99246A3/']
Traceback (most recent call last):
  File "/home/ubuntu/vggsfm/video_demo.py", line 74, in demo_fn
    predictions = vggsfm_runner.run(
  File "/home/ubuntu/vggsfm/vggsfm/runners/video_runner.py", line 164, in run
    start_idx, end_idx, move_success, _ = self.move_window(
  File "/home/ubuntu/vggsfm/vggsfm/runners/video_runner.py", line 699, in move_window
    ) = self.prepare_window_data(
  File "/home/ubuntu/vggsfm/vggsfm/runners/video_runner.py", line 1137, in prepare_window_data
    ) = predict_tracks(
  File "/home/ubuntu/vggsfm/vggsfm/runners/runner.py", line 1159, in predict_tracks
    fine_pred_track, pred_vis, pred_score = predict_tracks_in_chunks(
  File "/home/ubuntu/vggsfm/vggsfm/runners/runner.py", line 1315, in predict_tracks_in_chunks
    fine_pred_track, _, pred_vis, pred_score = track_predictor(
  File "/home/ubuntu/miniconda3/envs/vggsfm_tmp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/vggsfm_tmp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/vggsfm/vggsfm/models/track_predictor.py", line 105, in forward
    fine_pred_track, pred_score = refine_track(
  File "/home/ubuntu/vggsfm/vggsfm/models/track_modules/refine_track.py", line 149, in refine_track
    fine_pred_track_lists, _, _, query_point_feat = fine_tracker(
  File "/home/ubuntu/miniconda3/envs/vggsfm_tmp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/vggsfm_tmp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/vggsfm/vggsfm/models/track_modules/base_track_predictor.py", line 138, in forward
    fcorr_fn.corr(track_feats)
  File "/home/ubuntu/vggsfm/vggsfm/models/track_modules/blocks.py", line 413, in corr
    corrs = torch.matmul(fmap1, fmap2s)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 11.37 GiB. GPU 0 has a total capacty of 39.50 GiB of which 5.43 GiB is free. Including non-PyTorch memory, this process has 34.06 GiB memory in use. Of the allocated memory 21.83 GiB is allocated by PyTorch, and 11.69 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
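
For readers hitting the same error: the OOM message above reports 11.69 GiB reserved by PyTorch but unallocated, and it points to `max_split_size_mb` in `PYTORCH_CUDA_ALLOC_CONF`. A minimal sketch of setting it from Python, assuming the code runs before the first CUDA allocation (for example at the top of the launch script). The helper name `configure_allocator` and the value 128 are illustrative assumptions, not settings recommended in this thread:

```python
import os


def configure_allocator(max_split_size_mb=128, env=None):
    """Set allocator and Hydra debugging variables unless already defined.

    max_split_size_mb=128 is an assumed starting value, not a tuned one.
    Pass a plain dict as `env` to preview the result without touching os.environ.
    """
    env = os.environ if env is None else env
    # Ask the caching allocator not to split blocks larger than this size (in MB).
    env.setdefault("PYTORCH_CUDA_ALLOC_CONF", f"max_split_size_mb:{max_split_size_mb}")
    # Ask Hydra to print the complete stack trace, as the log above suggests.
    env.setdefault("HYDRA_FULL_ERROR", "1")
    return env


if __name__ == "__main__":
    # Call this before torch touches the GPU, otherwise the setting is ignored.
    configure_allocator()
    print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

This can only help if fragmentation is the cause; it does not reduce the memory the tracker itself needs.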

@jytime
Contributor

jytime commented Jan 17, 2025

Hi,

It seems to work better than before, but it now hits an out-of-memory error. This kind of error can be solved easily, as described in the FAQ:

"We may encounter an out-of-memory error when the number of input frames or query points is too high. In v2.0, we address this by splitting the points into several chunks and running the prediction separately. This involves two hardcoded hyperparameters: max_points_num=163840 in predict_tracks and max_tri_points_num=819200 in triangulate_tracks. These values are set for a 32GB GPU. If your GPU has less or more memory, reduce or increase these values accordingly."

If it still does not work well, feel free to share the images with me here or by email. I can give it a try.
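
As a rough illustration of the FAQ advice, the two chunk sizes can be scaled from the 32GB reference to the memory actually free on the GPU. This is a sketch under stated assumptions: the linear scaling rule, the `safety` margin, and the helper name `scale_chunk_sizes` are hypothetical. Only the two default values and the 32GB reference come from the FAQ text quoted above.

```python
# Defaults and reference GPU size as quoted from the FAQ.
REFERENCE_GPU_GIB = 32
DEFAULT_MAX_POINTS_NUM = 163840       # used in predict_tracks
DEFAULT_MAX_TRI_POINTS_NUM = 819200   # used in triangulate_tracks


def scale_chunk_sizes(free_gib, safety=0.8):
    """Scale both chunk sizes linearly with the available GPU memory.

    Linear scaling and the 0.8 safety margin are assumptions for illustration;
    the real memory cost also depends on frame count and image size.
    """
    scale = safety * free_gib / REFERENCE_GPU_GIB
    return {
        "max_points_num": max(1, int(DEFAULT_MAX_POINTS_NUM * scale)),
        "max_tri_points_num": max(1, int(DEFAULT_MAX_TRI_POINTS_NUM * scale)),
    }


if __name__ == "__main__":
    # Example: halve the budget when only about 16 GiB is usable.
    print(scale_chunk_sizes(16, safety=1.0))
```

The resulting numbers still have to be written into the hardcoded values by hand. If the run still runs out of memory, halving them again is a reasonable next step.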

@cdcseacave
Author

My GPU is about the same size as the one those settings assume; it has 40 GB:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:08:00.0 Off |                    0 |
| N/A   34C    P0             44W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

I have now reduced those numbers as you suggested, and I get a different error:

Poselib is available
Poselib is available
/home/ubuntu/vggsfm/video_demo.py:18: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
  @hydra.main(config_path="cfgs/", config_name="video_demo")
/home/ubuntu/miniconda3/envs/vggsfm_tmp/lib/python3.10/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Model Config: query_by_midpoint: true
shared_camera: true
camera_type: SIMPLE_RADIAL
max_query_pts: 4096
init_window_size: 64
window_size: 32
joint_BA_interval: 6
grid_based_sample: false
model_name: vggsfm_v2_0_0
seed: 0
img_size: 1024
debug: false
center_order: false
mixed_precision: fp16
extract_color: true
filter_invalid_frame: true
comple_nonvis: true
query_frame_num: 3
robust_refine: 2
BA_iters: 1
fine_tracking: true
load_gt: false
viz_visualize: false
gr_visualize: false
fmat_thres: 4.0
max_reproj_error: 4.0
init_max_reproj_error: 4.0
dense_depth: false
avg_pose: true
save_to_disk: true
SCENE_DIR: data/5F3D017A-8319-413E-B82E-9EC7A99246A3/
resume_ckpt: ckpt/vggsfm_v2_0_0.bin
auto_download_ckpt: true
query_method: aliked
use_poselib: true
shift_point2d_to_original_res: false
make_reproj_video: false
visual_tracks: false
visual_query_points: false
visual_dense_point_cloud: false
query_by_interval: false
concat_extra_points: false
extra_pt_pixel_interval: -1
extra_by_neighbor: -1
MODEL:
  _target_: vggsfm.models.VGGSfM
  TRACK:
    _target_: vggsfm.models.TrackerPredictor
    efficient_corr: false
    COARSE:
      stride: 4
      down_ratio: 2
      FEATURENET:
        _target_: vggsfm.models.BasicEncoder
      PREDICTOR:
        _target_: vggsfm.models.BaseTrackerPredictor
    FINE:
      FEATURENET:
        _target_: vggsfm.models.ShallowEncoder
      PREDICTOR:
        _target_: vggsfm.models.BaseTrackerPredictor
        depth: 4
        corr_levels: 3
        corr_radius: 3
        latent_dim: 32
        hidden_size: 256
        fine: true
        use_spaceatt: false
  CAMERA:
    _target_: vggsfm.models.CameraPredictor
  TRIANGULAE:
    _target_: vggsfm.models.Triangulator

Building VGGSfM
Using cache found in /home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/swiglu_ffn.py:51: UserWarning: xFormers is not available (SwiGLU)
  warnings.warn("xFormers is not available (SwiGLU)")
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/attention.py:33: UserWarning: xFormers is not available (Attention)
  warnings.warn("xFormers is not available (Attention)")
/home/ubuntu/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:40: UserWarning: xFormers is not available (Block)
  warnings.warn("xFormers is not available (Block)")
[2025-01-18 02:05:38,890][dinov2][INFO] - using MLP layer as FFN
VGGSfM built successfully
Data size of Sequence: 1
Run Sparse Reconstruction for Scene 5F3D017A-8319-413E-B82E-9EC7A99246A3
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
Predicting tracks with query_index = 31
Predict tracks in chunks to fit in memory
Predicting tracks with query_index = 0
Predict tracks in chunks to fit in memory
Predicting tracks with query_index = 63
Predict tracks in chunks to fit in memory
Processing non visible frames:  [38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 55, 56]
Predicting tracks with query_index = 38
Predict tracks in chunks to fit in memory
Processing non visible frames:  [50, 51, 52, 53, 55, 56]
Predicting tracks with query_index = 50
Predict tracks in chunks to fit in memory
Processing non visible frames:  [53, 55, 56]
Predicting tracks with query_index = 53
Predict tracks in chunks to fit in memory
I20250118 02:13:57.099540 133475071018816 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250118 02:13:58.552048 133475071018816 timer.cc:91] Elapsed time: 0.024 [minutes]
Finished init BA
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 7
This frame only has inliers: 2
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 14
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
This frame only has inliers: 0
some frames are invalid after BA refinement
Finished init refine pose
Triangulate tracks in chunks to fit in memory
I20250118 02:14:11.335748 133475071018816 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250118 02:14:13.761731 133475071018816 timer.cc:91] Elapsed time: 0.040 [minutes]
Finished track triangulation and BA
Frame 1 only has 0 geo_vis inliers
Frame 3 only has 0 geo_vis inliers
Frame 4 only has 0 geo_vis inliers
Frame 5 only has 0 geo_vis inliers
Frame 6 only has 0 geo_vis inliers
Frame 16 only has 0 geo_vis inliers
Frame 17 only has 0 geo_vis inliers
Frame 18 only has 0 geo_vis inliers
Frame 19 only has 0 geo_vis inliers
Frame 20 only has 0 geo_vis inliers
Frame 21 only has 0 geo_vis inliers
Frame 22 only has 0 geo_vis inliers
Frame 23 only has 92 geo_vis inliers
Frame 24 only has 0 geo_vis inliers
Frame 25 only has 0 geo_vis inliers
Frame 27 only has 0 geo_vis inliers
Frame 29 only has 0 geo_vis inliers
Frame 30 only has 0 geo_vis inliers
Frame 31 only has 0 geo_vis inliers
Frame 33 only has 0 geo_vis inliers
Frame 34 only has 0 geo_vis inliers
Frame 35 only has 67 geo_vis inliers
Frame 36 only has 67 geo_vis inliers
Frame 37 only has 0 geo_vis inliers
Frame 38 only has 67 geo_vis inliers
Frame 39 only has 1 geo_vis inliers
Frame 40 only has 0 geo_vis inliers
Frame 41 only has 0 geo_vis inliers
Frame 42 only has 0 geo_vis inliers
Frame 43 only has 0 geo_vis inliers
Frame 44 only has 0 geo_vis inliers
Frame 45 only has 0 geo_vis inliers
Frame 46 only has 1 geo_vis inliers
Frame 47 only has 1 geo_vis inliers
Frame 48 only has 0 geo_vis inliers
Frame 49 only has 0 geo_vis inliers
Frame 50 only has 0 geo_vis inliers
Frame 51 only has 0 geo_vis inliers
Frame 52 only has 1 geo_vis inliers
Frame 53 only has 0 geo_vis inliers
Frame 54 only has 1 geo_vis inliers
Frame 55 only has 1 geo_vis inliers
Frame 56 only has 0 geo_vis inliers
Frame 57 only has 0 geo_vis inliers
Frame 58 only has 0 geo_vis inliers
Frame 59 only has 85 geo_vis inliers
Frame 63 only has 15 geo_vis inliers
Triangulate tracks in chunks to fit in memory
I20250118 02:14:28.266928 133475071018816 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250118 02:14:32.852007 133475071018816 timer.cc:91] Elapsed time: 0.076 [minutes]
Finished robust refine 0
Frame 1 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 1
Frame 3 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 3
Frame 4 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 4
Frame 5 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 5
Frame 6 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 6
Frame 16 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 16
Frame 17 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 17
Frame 18 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 18
Frame 19 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 19
Frame 20 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 20
Frame 21 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 21
Frame 22 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 22
Frame 24 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 24
Frame 25 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 25
Frame 26 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 26
Frame 27 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 27
Frame 28 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 28
Frame 29 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 29
Frame 30 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 30
Frame 31 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 31
Frame 32 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 32
Frame 33 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 33
Frame 34 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 34
Frame 37 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 37
Frame 39 only has 8 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 39
Frame 40 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 40
Frame 41 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 41
Frame 42 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 42
Frame 43 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 43
Frame 44 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 44
Frame 45 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 45
Frame 46 only has 51 geo_vis inliers
Estimating absolute poses by visible matches for frame 46
Frame 47 only has 51 geo_vis inliers
Estimating absolute poses by visible matches for frame 47
Frame 48 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 48
Frame 49 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 49
Frame 50 only has 43 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 50
Frame 51 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 51
Frame 52 only has 48 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 52
Frame 53 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 53
Frame 54 only has 48 geo_vis inliers
Estimating absolute poses by visible matches for frame 54
Frame 55 only has 48 geo_vis inliers
Estimating absolute poses by visible matches for frame 55
Frame 56 only has 0 geo_vis inliers
Warning! Estimating absolute poses by non visible matches for frame 56
Frame 57 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 57
Frame 58 only has 0 geo_vis inliers
Estimating absolute poses by visible matches for frame 58
Estimating absolute poses by visible matches for frame 59
Estimating absolute poses by visible matches for frame 60
Estimating absolute poses by visible matches for frame 61
Estimating absolute poses by visible matches for frame 62
Frame 63 only has 38 geo_vis inliers
Estimating absolute poses by visible matches for frame 63
some frames are invalid after BA refinement
Triangulate tracks in chunks to fit in memory
I20250118 02:18:38.706309 133475071018816 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250118 02:18:48.428418 133475071018816 timer.cc:91] Elapsed time: 0.162 [minutes]
Filter 3D points in chunks to fit in memory
Finished robust refine 1
Running iterative BA by 1 times
Triangulate tracks in chunks to fit in memory
Filter 3D points in chunks to fit in memory
I20250118 02:19:05.536238 133475071018816 misc.cc:198]
==============================================================================
Global bundle adjustment
==============================================================================
I20250118 02:19:24.981187 133475071018816 bundle_adjustment.cc:866]
Bundle adjustment report:
    Residuals : 344866
   Parameters : 63454
   Iterations : 101
         Time : 19.3051 [s]
 Initial cost : 0.761103 [px]
   Final cost : 0.642267 [px]
  Termination : No convergence

I20250118 02:19:24.981328 133475071018816 timer.cc:91] Elapsed time: 0.324 [minutes]
Filter 3D points in chunks to fit in memory
Finished iterative BA 0
Processing window from 64 to 96
Predicting tracks with query_index = 0
Predict tracks in chunks to fit in memory
No valid frame, step back
Moving window failed, trying again. (This should not happen in most cases)
Processing window from 63 to 95
Error executing job with overrides: ['SCENE_DIR=data/5F3D017A-8319-413E-B82E-9EC7A99246A3/']
Traceback (most recent call last):
  File "/home/ubuntu/vggsfm/video_demo.py", line 74, in demo_fn
    predictions = vggsfm_runner.run(
  File "/home/ubuntu/vggsfm/vggsfm/runners/video_runner.py", line 173, in run
    start_idx, end_idx, move_success, _ = self.move_window(
  File "/home/ubuntu/vggsfm/vggsfm/runners/video_runner.py", line 662, in move_window
    pred_cameras = average_camera_prediction(
  File "/home/ubuntu/vggsfm/vggsfm/utils/utils.py", line 61, in average_camera_prediction
    new_order = calculate_index_mappings(
  File "/home/ubuntu/vggsfm/vggsfm/utils/utils.py", line 173, in calculate_index_mappings
    new_order[0] = query_index
IndexError: index 0 is out of bounds for dimension 0 with size 0

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

@jytime
Contributor

jytime commented Jan 18, 2025

Good. This shows that the model has successfully built the first reconstruction but fails to find the second, as shown by:

No valid frame, step back

Can you try reducing this number from 0.05 to 0.01?

track_vis_thres=0.05,

This lowers the threshold that decides whether a point is viewed as valid. For debugging purposes, you can even set it to 0.0.
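
To make the effect of the threshold concrete, here is a toy sketch of how a visibility cutoff decides which frames keep any usable points. It is not VGGSfM's code: the function `valid_frames`, the strict `>` comparison, and the sample scores are all assumptions for illustration.

```python
def valid_frames(vis_scores, track_vis_thres=0.05, min_visible=1):
    """Return indices of frames with at least `min_visible` points above the threshold.

    vis_scores: one list of per-point visibility scores per frame (hypothetical data).
    """
    kept = []
    for idx, scores in enumerate(vis_scores):
        visible = sum(1 for s in scores if s > track_vis_thres)
        if visible >= min_visible:
            kept.append(idx)
    return kept


if __name__ == "__main__":
    scores = [
        [0.40, 0.30, 0.02],  # confidently tracked frame
        [0.03, 0.02, 0.04],  # low-confidence frame
        [0.00, 0.00, 0.00],  # nothing tracked
    ]
    print(valid_frames(scores, 0.05))  # [0]
    print(valid_frames(scores, 0.01))  # [0, 1]
```

With the lower threshold, the low-confidence frame is no longer discarded. Keeping more points may also increase memory use.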

@cdcseacave
Author

Lowering that results in a CUDA OOM, and I cannot make it work by reducing those numbers. Please find the images attached; maybe it works for you:
https://limewire.com/d/04ea045a-410a-41ef-9519-a76e95128d9b#Q0kBm540sXUrS0tYkup6k6uQh9xuOaaN4nskIMx8HMo

@jytime
Contributor

jytime commented Jan 20, 2025

Hey, I see the problem here: it looks like the images are rotated, which will reduce the accuracy and confidence of the tracker a lot. I will rotate them back and run the reconstruction when I have time.
