
I am creating an app to train and use Sklearn models. The user should be able to select any of the Sklearn models and then change the arguments of the selected algorithm before training or running it. To do this efficiently for all current and future Sklearn models, I want to retrieve the type hints and default values of those arguments at runtime and generate the UI automatically from them.

The Sklearn modules do not use type hints, yet VS Code autocomplete does 'know' the types and default values. From what I could find, VS Code uses typeshed, but that seems difficult to access at runtime.

In short:
I want to pass sklearn.linear_model.LinearRegression to something and receive:

(*, fit_intercept: bool = True, copy_X: bool = True, n_jobs: int | None = None, positive: bool = False)

How would I go about receiving this information during runtime?

2 Answers


  1. Chosen as BEST ANSWER

    Sklearn estimator classes have a private class attribute called _parameter_constraints, which holds a list of constraints for each parameter. In my case, it seems easiest to deduce the possible types and the matching UI element from these constraints.

    E.g., for linear regression:

    _parameter_constraints: dict = {
     "fit_intercept": ["boolean"],
     "copy_X": ["boolean"],
     "n_jobs": [None, Integral],
     "positive": ["boolean"],
    }
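
As a rough illustration of deducing UI elements from such constraints, here is a minimal sketch. The widget names and the `widget_for` and `build_form` helpers are hypothetical, and only the constraint kinds shown above ("boolean", None, Integral) plus Real are handled. Anything else falls back to a text input, so richer constraint objects would need their own cases.

```python
from numbers import Integral, Real


def widget_for(constraints):
    """Pick a (hypothetical) widget kind for one parameter's constraint list."""
    optional = None in constraints  # None in the list means the value may be None
    kinds = [c for c in constraints if c is not None]
    if "boolean" in kinds:
        widget = "checkbox"
    elif Integral in kinds:
        widget = "integer_input"
    elif Real in kinds:
        widget = "float_input"
    else:
        widget = "text_input"  # fallback for constraint kinds not handled here
    return {"widget": widget, "optional": optional}


def build_form(parameter_constraints):
    """Map every parameter name to its widget description."""
    return {name: widget_for(c) for name, c in parameter_constraints.items()}


# Constraints copied from the linear regression example above.
linear_regression_constraints = {
    "fit_intercept": ["boolean"],
    "copy_X": ["boolean"],
    "n_jobs": [None, Integral],
    "positive": ["boolean"],
}

form = build_form(linear_regression_constraints)
for name, spec in form.items():
    print(name, spec)
```

With Sklearn installed, the hand-written dict could presumably be replaced by `sklearn.linear_model.LinearRegression._parameter_constraints`. Since that attribute is private, its contents may change between versions.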
    

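  2. sklearn uses numpy-style docstrings, so you can parse them with numpydoc and use inspect to get the signature:

    import inspect

    import sklearn.linear_model
    from numpydoc.docscrape import ClassDoc
    
    # Parsed docstring: gives the documented type of each parameter
    doc = ClassDoc(sklearn.linear_model.LinearRegression)
    # Constructor signature: gives the parameter names and default values
    sig = inspect.signature(sklearn.linear_model.LinearRegression)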
    

    Usage:

    >>> sig
    <Signature (*, fit_intercept=True, copy_X=True, n_jobs=None, positive=False)>
    
    >>> doc['Parameters']
    [Parameter(name='fit_intercept', type='bool, default=True', desc=['Whether to calculate the intercept for this model. If set', 'to False, no intercept will be used in calculations', '(i.e. data is expected to be centered).']), 
     Parameter(name='copy_X', type='bool, default=True', desc=['If True, X will be copied; else, it may be overwritten.']),
     Parameter(name='n_jobs', type='int, default=None', desc=['The number of jobs to use for the computation. This will only provide', 'speedup in case of sufficiently large problems, that is if firstly', '`n_targets > 1` and secondly `X` is sparse or if `positive` is set', 'to `True`. ``None`` means 1 unless in a', ':obj:`joblib.parallel_backend` context. ``-1`` means using all', 'processors. See :term:`Glossary <n_jobs>` for more details.']),
     Parameter(name='positive', type='bool, default=False', desc=['When set to ``True``, forces the coefficients to be positive. This', 'option is only supported for dense arrays.', '', '.. versionadded:: 0.24'])]
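
To get closer to the typed signature asked for in the question, the two sources can be merged: defaults from the signature, type names from the docstring. The sketch below uses only the standard library. `FakeRegression`, `split_type_string`, and `describe_parameters` are hypothetical names, and the type strings are copied from the output above rather than parsed live.

```python
import inspect


def split_type_string(type_string):
    """Split a numpydoc type string such as 'int, default=None' into its parts."""
    type_part, _, default_part = type_string.partition(", default=")
    return type_part.strip(), (default_part.strip() or None)


def describe_parameters(cls, doc_types):
    """Merge real defaults from the signature with documented type names."""
    described = {}
    for name, param in inspect.signature(cls).parameters.items():
        type_name, _ = split_type_string(doc_types.get(name, ""))
        described[name] = {"type": type_name or None, "default": param.default}
    return described


# Hypothetical stand-in with the same constructor signature as LinearRegression.
class FakeRegression:
    def __init__(self, *, fit_intercept=True, copy_X=True, n_jobs=None, positive=False):
        pass


# Type strings as they appear in the doc['Parameters'] output above.
doc_types = {
    "fit_intercept": "bool, default=True",
    "copy_X": "bool, default=True",
    "n_jobs": "int, default=None",
    "positive": "bool, default=False",
}

described = describe_parameters(FakeRegression, doc_types)
for name, info in described.items():
    print(name, info)
```

With numpydoc available, `doc_types` could be built as `{p.name: p.type for p in doc['Parameters']}`. Taking the default from the signature rather than from the docstring avoids having to convert strings like 'True' back into Python values.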
    