Commit 3db34b2

Precompute upper bound for number of files to scan
Instead of calling the CodeBase iterator directly, finder now splits the scan into two steps:

1. Build a list of all the files that might be in the code base.
2. Determine which files are in the code base.

The first step is a lot quicker than the second one, and tqdm can use the length of the list to create a progress bar.

Signed-off-by: John Pennycook <john.pennycook@intel.com>
1 parent c492f79 commit 3db34b2
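
The commit message relies on tqdm being able to read a length from the list. tqdm takes its total from an explicit `total` argument or, failing that, from `len()` of the iterable when one is available. A minimal sketch of why materializing the candidates first helps, assuming the original iterator behaved like a plain generator (the `potential_files` generator and its file names below are hypothetical):

```python
def potential_files():
    # Hypothetical generator standing in for the directory walk.
    yield from ["a.py", "b.txt", "c.cpp"]


gen = potential_files()
try:
    len(gen)
    generator_has_len = True
except TypeError:
    # Generators do not define __len__, so a progress bar wrapped
    # around one cannot know its total up front.
    generator_has_len = False

# Materializing the generator gives a sized list, and therefore
# an upper bound on the number of items the second pass will visit.
materialized = list(potential_files())
total = len(materialized)
```

The list costs memory proportional to the number of candidate paths, which is the trade-off for getting a bounded progress bar in the second pass.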

1 file changed: 19 additions, 3 deletions

codebasin/finder.py

@@ -163,16 +163,32 @@ def find(
     if isinstance(rootdir, Path):
         rootdir = str(rootdir)
 
-    # Identify all of the files in the codebase.
+    # Build up a list of potential source files.
+    def _potential_file_generator(codebase):
+        for directory in codebase._directories:
+            yield from Path(directory).rglob("*")
+
+    potential_files = []
+    for f in tqdm(
+        _potential_file_generator(codebase),
+        desc="Scanning current directory",
+        unit=" files",
+        leave=False,
+        disable=not show_progress,
+    ):
+        potential_files.append(f)
+
+    # Identify which files are in the code base.
     filenames = set()
     for f in tqdm(
-        codebase,
+        potential_files,
         desc="Identifying source files",
         unit=" files",
         leave=False,
         disable=not show_progress,
     ):
-        filenames.add(f)
+        if f in codebase:
+            filenames.add(f)
     for p in configuration:
         for e in configuration[p]:
             filenames.add(e["file"])
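
The two-pass pattern in this change can be tried outside the project with a small stand-in. The sketch below is not the codebasin implementation: `ToyCodeBase`, its suffix filter, `two_step_scan`, and the `wrap` parameter are all hypothetical names introduced for illustration. Only the `_directories` attribute, the `rglob("*")` walk, and the `f in codebase` membership test mirror the diff.

```python
import tempfile
from pathlib import Path


class ToyCodeBase:
    """Hypothetical stand-in for CodeBase: directories plus a suffix filter."""

    def __init__(self, directories, suffixes=(".py",)):
        self._directories = list(directories)
        self._suffixes = set(suffixes)

    def __contains__(self, path):
        # Membership is the comparatively expensive check in the real scan.
        path = Path(path)
        return path.is_file() and path.suffix in self._suffixes


def two_step_scan(codebase, wrap=lambda iterable, **kwargs: iterable):
    # Step 1: cheaply enumerate every path under each directory.
    potential_files = []
    for directory in codebase._directories:
        potential_files.extend(Path(directory).rglob("*"))

    # Step 2: filter the sized list; a progress bar can read len() here.
    filenames = set()
    for f in wrap(potential_files, desc="Identifying source files"):
        if f in codebase:
            filenames.add(str(f))
    return potential_files, filenames


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.py").write_text("x = 1\n")
    (root / "sub" / "b.py").write_text("y = 2\n")
    (root / "notes.txt").write_text("not source\n")

    potential, found = two_step_scan(ToyCodeBase([root]))
    upper_bound = len(potential)
    found_names = sorted(Path(f).name for f in found)
```

Passing `wrap=tqdm.tqdm` (with tqdm installed) would show a bar for the second pass. Note that `rglob("*")` also yields directories, so the length of the list is an upper bound on the number of source files rather than an exact count, which matches the commit title.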
