Resolving License URLs

The content of the license texts referenced by the URLs in NormalizedLicense.effectiveNormalizedLicenseUrl and NormalizedLicense.licenseRefUrl is resolved as follows:

  • If the content is found as a resource on the classpath under licenses, that resource is used. (The Solicitor application might include a set of frequently used license texts, so these do not need to be fetched over the network.) If the classpath does not contain the content for the URL, the next step is taken.

  • If the content is found as a file in the licenses subdirectory of the current working directory, that file is used. If no such file exists, the content is fetched over the network and the result is written to that directory, so each URL is fetched only once. (The user may edit the files in that directory to change or correct their content.) A file of length zero indicates that no content could be fetched.

The resolved content is available as NormalizedLicense.effectiveNormalizedLicenseContent and NormalizedLicense.licenseRefContent.
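The lookup order described above can be sketched as follows. This is a minimal illustration in Python, not Solicitor's actual implementation: the function name resolve_license_content, the bundled dictionary (standing in for classpath resources under licenses) and the fetch callable are all hypothetical. Returning None for a zero-length file is likewise an assumption; the text only states that such a file marks a failed fetch.

```python
from pathlib import Path
from typing import Callable, Mapping, Optional


def resolve_license_content(
    url: str,
    name: str,
    bundled: Mapping[str, str],
    cache_dir: Path,
    fetch: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Resolve license text for `url`; `name` is its encoded resource/file name.

    All parameters are hypothetical stand-ins used for illustration only.
    """
    # Step 1: prefer content bundled with the application
    # (the dictionary stands in for classpath resources under "licenses").
    if name in bundled:
        return bundled[name]

    # Step 2: look for a previously stored file in the local directory.
    cache_file = cache_dir / name
    if cache_file.exists():
        content = cache_file.read_text(encoding="utf-8")
        # Assumption: a zero-length file (failed fetch) is reported as None.
        return content or None

    # Step 3: fetch over the network and store the result, so the URL
    # is fetched only once. A failed fetch is stored as an empty file.
    content = fetch(url)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(content or "", encoding="utf-8")
    return content or None
```

Because the stored file is consulted before any network access, editing or replacing a file in the licenses directory changes what later runs see, which matches the note above about users correcting content by hand.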

Encoding of URLs

When the resource name or filename for a given URL is created in the steps above, the following encoding scheme is applied to ensure that the result is always a valid name:

  • If the scheme is https, it is replaced with http.

  • All "non-word" characters (i.e. characters outside the set [a-zA-Z_0-9]) are replaced by underscores ("_").

  • If the resulting filename is longer than 250 characters, it is replaced by a new name formed by concatenating:

    • the first 40 characters of the (too) long filename

    • two underscores

    • the hex-encoded SHA-256 hash of the (too) long filename

    • two underscores

    • the last 40 characters of the (too) long filename
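The encoding scheme above can be sketched in a few lines of Python. This is an illustrative reading of the rules, not the code Solicitor uses: the function name encode_license_url and the constant MAX_NAME_LENGTH are hypothetical, and the sketch assumes the scheme is matched as a lowercase "https:" prefix and that the hash is computed over the UTF-8 bytes of the too-long name.

```python
import hashlib
import re

MAX_NAME_LENGTH = 250  # length limit stated in the text above


def encode_license_url(url: str) -> str:
    """Turn a license URL into a resource/file name (illustrative sketch)."""
    # Rule 1: replace the https scheme with http.
    # Assumption: the scheme is matched as a lowercase prefix.
    if url.startswith("https:"):
        url = "http:" + url[len("https:"):]

    # Rule 2: replace every character outside [a-zA-Z_0-9] with "_".
    name = re.sub(r"[^a-zA-Z0-9_]", "_", url)

    # Rule 3: shorten names longer than the limit to
    # <first 40 chars>__<hex SHA-256 of the long name>__<last 40 chars>.
    if len(name) > MAX_NAME_LENGTH:
        digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
        name = name[:40] + "__" + digest + "__" + name[-40:]
    return name


print(encode_license_url("https://www.apache.org/licenses/LICENSE-2.0.txt"))
# prints "http___www_apache_org_licenses_LICENSE_2_0_txt"
```

A shortened name is always 40 + 2 + 64 + 2 + 40 = 148 characters long, since a hex-encoded SHA-256 hash has 64 characters.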

Last updated 2023-11-20 10:37:01 UTC