FAQs

My function is stuck in the deploying state. What do I do?

Depending on the size of your containers and models, it usually takes up to 30 minutes for your function to deploy, although durations up to 2 hours are permitted. If you believe your function should have deployed already, or if it has entered an error state, review the logs to understand what happened.

See below for information on how to view and use logs to troubleshoot issues with your functions.

There may not be enough capacity available to fulfill your deployment. Try reducing the number of instances you are requesting or changing the GPU/instance type used by your function.

How do I view the logs for my functions?

This is coming soon.

I’m getting errors when invoking my function. What do I do?

Please review the logs and update your container or model as required.

Models

Why am I encountering a “model file not found” error when launching an inference container?

This error typically occurs when the inference container expects a model file in a specified location, but the file is not present. Ensure the path for your model files is correct and the necessary files, like config.json, are available at that location.

What does the log entry “No MODELS environment variable defined” mean?

This message means that the system did not find an environment variable named MODELS during the worker pod’s initialization. If your setup requires this environment variable, ensure it’s defined in your function’s configuration.

Are model files expected to be compressed, and why does the system attempt to extract a model?

Previously, models were compressed, and the system needed to extract them before use. This is no longer necessary, but the log entry indicating model extraction persists from this legacy setup. Your model files can now be in a standard file structure without compression.

Please note that when uploading containers to NGC, the maximum size allowed per layer is 10GB. Additionally, the maximum size for uploading a model is capped at 5TB.

Invalid credentials: Client ID and secret are invalid to fetch token.
Invalid Token: May be expired or some other unknown issue.
Invalid Json input: The JSON provided is not formatted correctly.
HTTP standard errors: Generic HTTP errors that occur during communication.
cuOpt lib errors: Errors such as 419 or others that occur due to malformed JSON input.
Service lib errors: When the service solver fails to find a solution. This could be a generic error, a segmentation fault, or simply an infeasible problem.
S3 errors: Issues with uploading to or pulling data from S3.

Terminology

See Terminology section.

My function is stuck in the deploying state. What do I do?

How do I view the logs for my functions?

I’m getting errors when invoking my function. What do I do?

Models

Why am I encountering a “model file not found” error when launching an inference container?

What does the log entry “No MODELS environment variable defined” mean?

Are model files expected to be compressed, and why does the system attempt to extract a model?

Where should I expect the config.json file to be located in my setup?

What does it mean when a model is being “copied,” and does it affect my deployed models?

Errors

Errors to handle in Thin Client

Terminology