lint: Remove string exclusion from locale check

The string exclusion would fail to detect `"bla" + wrong()`.

Also, remove /* */ comment exclusion, which would fail to detect stuff
like `/* bla */ wrong()`.

Instead, require the function to be called by adding \\( to the regex.

Finally, also remove the section in the dev notes, because:

* It was outdated and missing some functions such as std::to_string in
  the list.
* The maintenance overhead of having to update two places is fragile and
  questionable.
* Many other linters are also not mentioned in the dev notes, even
  though they are important.
* A dev (and CI) is more likely to run the linters than to read the dev
  notes.
* The dev notes are more than 1000 lines of dense information. It would
  be easier to digest if they focused on the important stuff that is not
  checked by automated tools.
This commit is contained in:
MarcoFalke 2025-05-07 10:05:19 +02:00
parent fa2c548429
commit fa24fdcb7f
No known key found for this signature in database
2 changed files with 5 additions and 32 deletions

View File

@ -1013,34 +1013,6 @@ Strings and formatting
- *Rationale*: These functions do overflow checking and avoid pesky locale issues.
- Avoid using locale dependent functions if possible. You can use the provided
[`lint-locale-dependence.py`](/test/lint/lint-locale-dependence.py)
to check for accidental use of locale dependent functions.
- *Rationale*: Unnecessary locale dependence can cause bugs that are very tricky to isolate and fix.
- These functions are known to be locale dependent:
`alphasort`, `asctime`, `asprintf`, `atof`, `atoi`, `atol`, `atoll`, `atoq`,
`btowc`, `ctime`, `dprintf`, `fgetwc`, `fgetws`, `fprintf`, `fputwc`,
`fputws`, `fscanf`, `fwprintf`, `getdate`, `getwc`, `getwchar`, `isalnum`,
`isalpha`, `isblank`, `iscntrl`, `isdigit`, `isgraph`, `islower`, `isprint`,
`ispunct`, `isspace`, `isupper`, `iswalnum`, `iswalpha`, `iswblank`,
`iswcntrl`, `iswctype`, `iswdigit`, `iswgraph`, `iswlower`, `iswprint`,
`iswpunct`, `iswspace`, `iswupper`, `iswxdigit`, `isxdigit`, `mblen`,
`mbrlen`, `mbrtowc`, `mbsinit`, `mbsnrtowcs`, `mbsrtowcs`, `mbstowcs`,
`mbtowc`, `mktime`, `putwc`, `putwchar`, `scanf`, `snprintf`, `sprintf`,
`sscanf`, `stoi`, `stol`, `stoll`, `strcasecmp`, `strcasestr`, `strcoll`,
`strfmon`, `strftime`, `strncasecmp`, `strptime`, `strtod`, `strtof`,
`strtoimax`, `strtol`, `strtold`, `strtoll`, `strtoq`, `strtoul`,
`strtoull`, `strtoumax`, `strtouq`, `strxfrm`, `swprintf`, `tolower`,
`toupper`, `towctrans`, `towlower`, `towupper`, `ungetwc`, `vasprintf`,
`vdprintf`, `versionsort`, `vfprintf`, `vfscanf`, `vfwprintf`, `vprintf`,
`vscanf`, `vsnprintf`, `vsprintf`, `vsscanf`, `vswprintf`, `vwprintf`,
`wcrtomb`, `wcscasecmp`, `wcscoll`, `wcsftime`, `wcsncasecmp`, `wcsnrtombs`,
`wcsrtombs`, `wcstod`, `wcstof`, `wcstoimax`, `wcstol`, `wcstold`,
`wcstoll`, `wcstombs`, `wcstoul`, `wcstoull`, `wcstoumax`, `wcswidth`,
`wcsxfrm`, `wctob`, `wctomb`, `wctrans`, `wctype`, `wcwidth`, `wprintf`
- For `strprintf`, `LogInfo`, `LogDebug`, etc formatting characters don't need size specifiers.
- *Rationale*: Bitcoin Core uses tinyformat, which is type safe. Leave them out to avoid confusion.

View File

@ -1,5 +1,5 @@
#!/usr/bin/env python3
# Copyright (c) 2018-2022 The Bitcoin Core developers
# Copyright (c) 2018-present The Bitcoin Core developers
# Distributed under the MIT software license, see the accompanying
# file COPYING or http://www.opensource.org/licenses/mit-license.php.
#
@ -43,6 +43,7 @@ from subprocess import check_output, CalledProcessError
KNOWN_VIOLATIONS = [
"src/dbwrapper.cpp:.*vsnprintf",
"src/span.h:.*printf",
"src/test/fuzz/locale.cpp:.*setlocale",
"src/test/util_tests.cpp:.*strtoll",
"src/wallet/bdb.cpp:.*DbEnv::strerror", # False positive
@ -215,7 +216,7 @@ LOCALE_DEPENDENT_FUNCTIONS = [
def find_locale_dependent_function_uses():
regexp_locale_dependent_functions = "|".join(LOCALE_DEPENDENT_FUNCTIONS)
exclude_args = [":(exclude)" + excl for excl in REGEXP_EXTERNAL_DEPENDENCIES_EXCLUSIONS]
git_grep_command = ["git", "grep", "-E", "[^a-zA-Z0-9_\\`'\"<>](" + regexp_locale_dependent_functions + ")(_r|_s)?[^a-zA-Z0-9_\\`'\"<>]", "--", "*.cpp", "*.h"] + exclude_args
git_grep_command = ["git", "grep", "--extended-regexp", "[^a-zA-Z0-9_\\`'\"<>](" + regexp_locale_dependent_functions + ")(_r|_s)?\\(", "--", "*.cpp", "*.h"] + exclude_args
git_grep_output = list()
try:
@ -235,8 +236,8 @@ def main():
for locale_dependent_function in LOCALE_DEPENDENT_FUNCTIONS:
matches = [line for line in git_grep_output
if re.search("[^a-zA-Z0-9_\\`'\"<>]" + locale_dependent_function + "(_r|_s)?[^a-zA-Z0-9_\\`'\"<>]", line)
and not re.search("\\.(c|cpp|h):\\s*(//|\\*|/\\*|\").*" + locale_dependent_function, line)
if re.search("[^a-zA-Z0-9_\\`'\"<>]" + locale_dependent_function + "(_r|_s)?\\(", line)
and not re.search("\\.(c|cpp|h):\\s*//.*" + locale_dependent_function, line)
and not re.search(regexp_ignore_known_violations, line)]
if matches:
print(f"The locale dependent function {locale_dependent_function}(...) appears to be used:")