module util::LanguageServer

rascal-0.40.17
rascal-lsp-2.21.0-2

Bridges {DSL,PL,Modeling} language features to the language server protocol.

Usage

import util::LanguageServer;

Source code

http://github.com/usethesource/rascal-language-servers/blob/main/rascal-lsp/src/main/rascal/util/LanguageServer.rsc

Dependencies

import util::Reflective;
import analysis::diff::edits::TextEdits;
import IO;
import ParseTree;
import Message;

Description

Using the Register Language function you can connect any parsers, checkers, source-to-source transformers, visualizers, etc. that are made with Rascal, to the Language Server Protocol.

Benefits

  • Turn your language implementation into an interactive IDE at almost zero cost.

data Language

Definition of a language server by its meta-data.

data Language  
= language(PathConfig pcfg, str name, set[str] extensions, str mainModule, str mainFunction)
;

The Register Language function takes this as its parameter to generate and run a fresh language protocol server. Every language server is run in its own Rascal execution environment. The Language data-type defines the parameters of this run-time, such that Register Language can boot and initialize new instances.

  • pcfg sets up search paths for Rascal modules and libraries required to run the language server
  • name is the name of the language
  • extensions are the file extensions that bind this server to editors of files with these extensions.
  • mainModule is the Rascal module to load to start the language server
  • mainFunction is a function of type set[LanguageService] () that produces the implementation of the language server as a set of collaborating Language Services.

Benefits

  • each registered language is run in its own Rascal run-time environment.
  • reloading a language is always done in a fresh environment.
  • instances of Language can be easily serialized and communicated in interactive language engineering environments.

Pitfalls

  • even though Register Language is called in a given run-time environment, the registered language runs in another instance of the JVM and of Rascal.
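
As an illustration of the fields listed above, the following is a minimal sketch of a Language value being registered. The toy grammar, the module name demo::MyLanguageServer, the project name my-project, the language name and the file extension are all made up for this example; only language, registerLanguage, parsing, parser and pathConfig come from the libraries.

```rascal
module demo::MyLanguageServer

import util::LanguageServer;
import util::Reflective;
import ParseTree;

// Hypothetical toy grammar: a program is a sequence of lower-case identifiers.
lexical Id = [a-z]+ !>> [a-z];
layout Whitespace = [\ \t\n\r]* !>> [\ \t\n\r];
start syntax Program = Id*;

// The main function: its type must be set[LanguageService] ().
set[LanguageService] myServices() = {
    parsing(parser(#start[Program]))
};

// Hypothetical registration; the project, name and extension are assumptions.
void main() {
    registerLanguage(
        language(
            pathConfig(srcs = [|project://my-project/src/main/rascal|]),
            "MyLang",                 // name
            {"mylang"},               // extensions
            "demo::MyLanguageServer", // mainModule
            "myServices"              // mainFunction
        )
    );
}
```

Because mainModule and mainFunction are plain strings, the Language value stays serializable; the fresh run-time loads the module by name and calls the function there.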

function language

Language language(PathConfig pcfg, str name, str extension, str mainModule, str mainFunction)

alias Parser

Function profile for parser contributions to a language server.

danger

deprecated: marked for future deletion Used only in deprecated functions.

Tree (str _input, loc _origin)

The parser function takes care of parsing the tree once after every change in the IDE. This parse tree is then used for both syntax highlighting and other language server functions.

Pitfalls

  • use ParseTree::parser instead of writing your own function to ensure syntax highlighting is fast

alias Summarizer

Function profile for summarizer contributions to a language server.

danger

deprecated: marked for future deletion Used only in deprecated functions.

Summary (loc _origin, Tree _input)

alias Focus

A focus provides the currently selected language constructs around the cursor.

list[Tree]

A Focus list starts with the bottom tree, commonly a lexical identifier if the cursor is inside an identifier, and ends with the start non-terminal (the whole tree). Everything in between is a spine of Parse Tree nodes for the language constructs between the bottom and the top node.

The location of each element in the focus list is around (inclusive) the current cursor selection. This means that:

  • every next element in the list is the parent of the previous one.
  • typically the list starts with a smallest tree and ends with the entire start tree.
  • singleton lists may occur in case the cursor is on a layout or literal element of the top production.
  • the start[X] tree is typically preceded by the X tree.
  • the first tree is a whole lexical tree if the cursor is inside an identifier or constant
  • the first tree is a (context-free) syntax tree if the cursor is on the whitespace in between literals and lexicals.
  • the focus list may be empty in case of top-level ambiguity or parse errors.

The Focus is typically provided to the Language Services below, such that language engineers can provide language-directed tools, which are relevant to the current interest of the user.

Benefits

  • All functions that accept a Focus can use list matching to filter locations of interest.

Pitfalls

  • Functions that use list matching on their Focus parameter must provide a default that returns the empty list or empty set.
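
A minimal sketch of list matching on a Focus, with the required default, could look as follows. The names demo::MyHover and myHover are hypothetical; the function shape follows the hover constructor of LanguageService.

```rascal
module demo::MyHover

import util::LanguageServer;
import ParseTree;

// Matches any non-empty focus; the first element is the smallest tree around the cursor.
// Interpolating a Tree into a string yields its source text.
set[str] myHover([Tree smallest, *Tree _spine]) = {"Selected: <smallest>"};

// The default covers focus lists that do not match, such as the empty list.
default set[str] myHover(Focus _focus) = {};
```

Such a function would be contributed as hover(myHover). Without the default alternative, an empty focus would not match any alternative and the call would fail.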

alias Outliner

Function profile for outliner contributions to a language server.

danger

deprecated: marked for future deletion Only in use in deprecated functions.

list[DocumentSymbol] (Tree _input)

alias LensDetector

Function profile for lenses contributions to a language server.

danger

deprecated: marked for future deletion Only in use in deprecated functions.

rel[loc src, Command lens] (Tree _input)

alias OrderedLensDetector

Function profile for lenses contributions to a language server.

lrel[loc src, Command lens] (Tree _input)

alias CommandExecutor

Function profile for executor contributions to a language server.

danger

deprecated: marked for future deletion Only in use in deprecated functions.

value (Command _command)

alias InlayHinter

Function profile for inlay contributions to a language server.

danger

deprecated: marked for future deletion Only in use in deprecated functions.

list[InlayHint] (Tree _input)

alias Documenter

set[str] (loc _origin, Tree _fullTree, Tree _lexicalAtCursor)

alias CodeActionContributor

list[CodeAction] (Focus _focus)

alias Definer

Function profile for definer contributions to a language server.

danger

deprecated: marked for future deletion Use Definition instead.

set[loc] (loc _origin, Tree _fullTree, Tree _lexicalAtCursor)

alias Referrer

Function profile for referrer contributions to a language server.

danger

deprecated: marked for future deletion Use References instead.

set[loc] (loc _origin, Tree _fullTree, Tree _lexicalAtCursor)

alias Implementer

Function profile for implementer contributions to a language server.

danger

deprecated: marked for future deletion Use Implementation instead.

set[loc] (loc _origin, Tree _fullTree, Tree _lexicalAtCursor)

data LanguageService

Each kind of service contributes the implementation of one (or several) IDE features.

data LanguageService  
= parsing(Tree (str _input, loc _origin) parsingService
, bool usesSpecialCaseHighlighting = true)
| analysis(Summary (loc _origin, Tree _input) analysisService
, bool providesDocumentation = true
, bool providesHovers = providesDocumentation
, bool providesDefinitions = true
, bool providesReferences = true
, bool providesImplementations = true)
| build(Summary (loc _origin, Tree _input) buildService
, bool providesDocumentation = true
, bool providesHovers = providesDocumentation
, bool providesDefinitions = true
, bool providesReferences = true
, bool providesImplementations = true)
| documentSymbol(list[DocumentSymbol] (Tree _input) documentSymbolService)
| codeLens (lrel[loc src, Command lens] (Tree _input) codeLensService)
| inlayHint (list[InlayHint] (Tree _input) inlayHintService)
| execution (value (Command _command) executionService)
| hover (set[str] (Focus _focus) hoverService)
| definition (set[loc] (Focus _focus) definitionService)
| references (set[loc] (Focus _focus) referencesService)
| implementation(set[loc] (Focus _focus) implementationService)
| codeAction (list[CodeAction] (Focus _focus) codeActionService)
;

Each LanguageService constructor provides one aspect of defining the language server protocol (LSP). Their names coincide exactly with the services which are documented here.

  • The Parsing service that maps source code strings to a Tree is essential and non-optional. All other services are optional.
    • By providing a parser which produces annotated parse Trees, editor features such as parse error locations, syntax highlighting and selection assistance are immediately enabled.
    • The Parsing service is activated after every change in an editor document (when a suitable pause has occurred)
    • All downstream services are based on the Tree that is produced here. In particular downstream services make use of the src origin fields that the parser must produce.
    • Parsers can be obtained automatically using the parser or parsers functions, for example parser(#start[Program]). A parser obtained this way is fast and does not require a global interpreter lock. Passing in a normal Rascal function is also fine, but then the global interpreter lock will make the editor services less responsive.
    • Currently, @category tags are ignored in the following special case:
      • if a parse tree has a syntax non-terminal node n with a category (either declared as part of n, or inherited from an ancestor),
      • and if n has a syntax non-terminal node m as a child,
      • then the category of n is ignored in the subtree rooted at m (regardless of whether a category is declared as part of m). This special case is deprecated and will be removed in a future release. In anticipation of the removal, users that rely on this special case for syntax highlighting can update their grammars and explicitly opt-out of the special case by passing usesSpecialCaseHighlighting = false when registering the Parsing service.
  • The Analysis service indexes a file as a Summary, offering precomputed relations for looking up hover documentation, definition with uses, references to declarations, implementations of types and compiler errors and warnings.
    • An Analysis focuses on its own file, but may reuse cached or stored indices from other files.
    • An Analysis has to be quick since it runs in an interactive editor setting.
    • An Analysis may store previous results (in memory) for incremental updates.
    • An Analysis is triggered on demand during typing, in a short typing pause, so you have to provide a reasonably fast function (0.5 seconds is a good target response time).
    • An Analysis pushes its result on a local stack, which is efficiently queried by the LSP features on demand.
  • The build service is similar to an Analysis, but it may perform computation-heavier additional checks or take time to generate source code and binary code that makes the code in the editor executable.
    • builds typically run whole-program analyses and compilation steps.
    • builds have side-effects, they store generated code or code indices for future usage by the next build step, or by the next analysis step.
    • builds are triggered on save-file events; they push information to an internal cache.
    • Warning: builds are not triggered when a file changes on disk outside of VS Code; instead, this results in a change event (not a save event), which triggers the Analyzer.
    • If providesDocumentation is false, then the Hover service may be activated. The same holds for providesDefinitions and the Definition service, and likewise for providesReferences and providesImplementations.
  • the following contributions are on-demand (pull) versions of information also provided by the Analysis and build summaries.
    • you can provide these more lightweight on-demand services instead of the Summary versions.
    • these functions are run synchronously after a user interaction. The run-time of each service corresponds directly to the UX response time.
    • a Hover service is a fast and location specific version of the documentation relation in a Summary.
    • a Definition service is a fast and location specific version of the definitions relation in a Summary.
    • a References service is a fast and location specific version of the references relation in a Summary.
    • an Implementation service is a fast and location specific version of the implementations relation in a Summary.
  • The Document Symbol service maps a source file to a pretty hierarchy for visualization in the "outline" view and "symbol search" features.
  • The Code Lens service discovers places to add "lenses" (little views embedded in the editor on a separate line) and connects commands to execute to each lens.
  • The Inlay Hint service discovers places to add "inlays" (little views embedded in the editor on the same line). Unlike Lenses, inlays do not offer command execution.
  • The Execution service executes the commands registered by Lenses and Inlay Hinters.
  • The Code Action service discovers places in the editor to add "code actions" (little hints in the margin next to where the action is relevant) and connects Code Actions to execute when the user selects the action from a menu.

Many services receive a Focus parameter. The focus lists the syntactical constructs under the current cursor, from the current leaf all the way up to the root of the tree. This list helps to create functionality that is syntax-directed, and always relevant to the programmer.

To start developing an LSP extension step-by-step:

  1. first write a SyntaxDefinition in Rascal and register it via the Parsing service. Use Register Language from the terminal REPL to test it immediately. Create some example files for your language to play around with.
  2. either make an Analysis service that produces a Summary or start with Hover, Definition, References and Implementation lookup services. Each of those four services requires the same information that is useful for filling a Summary with an Analysis or a Builder.
  3. the Document Symbol service is next, good for the outline view and also quick search features.
  4. then, to add interactive features, optionally Inlay Hint, Code Lens and Code Action can be created to add visible hooks in the UI to trigger your own Code Actions and Commands.
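
The first and third steps above can be sketched as follows. The grammar, the module name and the function names myOutline and myServices are hypothetical; the sketch assumes that every identifier in the file should show up as a flat entry in the outline.

```rascal
module demo::MyOutline

import util::LanguageServer;
import ParseTree;

// Hypothetical toy grammar: a program is a sequence of lower-case identifiers.
lexical Id = [a-z]+ !>> [a-z];
layout Whitespace = [\ \t\n\r]* !>> [\ \t\n\r];
start syntax Program = Id*;

// Step 3: one outline entry per identifier, pointing at its source location.
list[DocumentSymbol] myOutline(Tree input)
    = [symbol("<id>", \variable(), id.src) | /Id id := input];

// Steps 1 and 3 together: the parser is mandatory, the outline is optional.
set[LanguageService] myServices() = {
    parsing(parser(#start[Program])),
    documentSymbol(myOutline)
};
```

Since myOutline is an ordinary function over a Tree, it can be called in the REPL on the result of parse(#start[Program], ...) before any registration takes place.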

Benefits

  • You can create editor services thinking only of your programming language or domain-specific language constructs. All of the communication and (de)serialization and scheduling is taken care of.
  • It is always possible and useful to test your services manually in the REPL. This is the preferred way of testing and debugging language services.
  • Except for the Parsing service, all services are independent of each other. If one fails, or is removed, the others still work.
  • Language services in general can be unit-tested easily by providing example parse trees and testing properties of their output. Write lots of test functions!
  • LanguageServices are editor-independent/IDE-independent via the LSP protocol. In principle they can work with any editor that implements LSP 3.17 or higher.
  • Older Eclipse DSL plugins via the rascal-eclipse plugin are easily ported to LanguageServer.

Pitfalls

  • If one of the services does not type-check in Rascal, or throws an exception at Register Language time, the extension fails completely. Typically the editor produces a parse error on the first line of the code. The failure is printed in the log window of the IDE.
  • Users have expectations with the concepts of References, Definition, Implementation which are based on typical programming language concepts. Since these are all just rel[loc, loc] it can be easy to confound them.
    • References point from declaration sites to use sites.
    • Definition points the other way around, from a use to the declaration, but only if a value is associated there explicitly or implicitly.
    • Implementation points from abstract declarations (interfaces, classes, function signatures) to more concrete realizations of those declarations.
  • providesDocumentation is deprecated. Use providesHovers instead.

function parser

Construct a parsing Language Service.

danger

deprecated: marked for future deletion Backward compatible with Parsing.

LanguageService parser(Parser parser)

function lenses

Construct a Code Lens Language Service.

danger

deprecated: marked for future deletion Backward compatible with Code Lens.

LanguageService lenses(LensDetector detector)

This function not only translates the old name to the new Code Lens service; it also turns the relation produced by the Lens Detector into an arbitrarily ordered list, which matches the earlier behavior.

Benefits

  • If you need your lenses in a stable order in the editor, use the Code Lens constructor instead to provide a function that uses an ordered list.

function actions

Construct a Code Action Language Service.

danger

deprecated: marked for future deletion Backward compatible with Code Action.

LanguageService actions(CodeActionContributor contributor)

function builder

Construct a build Language Service.

danger

deprecated: marked for future deletion Backward compatible with build.

LanguageService builder(Summarizer summarizer)

function outliner

Construct a Document Symbol Language Service.

danger

deprecated: marked for future deletion Backward compatible with Document Symbol.

LanguageService outliner(Outliner outliner)

function inlayHinter

Construct an Inlay Hint Language Service.

danger

deprecated: marked for future deletion Backward compatible with Inlay Hint.

LanguageService inlayHinter(InlayHinter hinter)

function executor

Construct an Execution Language Service.

danger

deprecated: marked for future deletion Backward compatible with Execution.

LanguageService executor(CommandExecutor executor)

function documenter

LanguageService documenter(Documenter d)

function definer

LanguageService definer(Definer d)

function referrer

Registers an old-style Referrer.

danger

deprecated: marked for future deletion This is a backward compatibility layer for the pre-existing Referrer alias.

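To replace an old-style Referrer with a new style References service, follow this scheme:

// the old-style Referrer receives the document, the full tree and the lexical at the cursor:
set[loc] oldReferrer(loc document, Tree fullTree, Tree selection) {
    ...
}
// it is replaced by a References service that matches on the Focus:
set[loc] newReferencesService([Tree selection, *Tree _spine, Tree fullTree]) {
    loc document = selection@\loc.top;
    ...
}
// the default covers focus lists that do not match, such as the empty list:
default set[loc] newReferencesService(list[Tree] _focus) = {};
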
LanguageService referrer(Referrer d)

function implementer

Registers an old-style Implementer.

danger

deprecated: marked for future deletion This is a backward compatibility layer for the pre-existing Implementer alias.

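To replace an old-style Implementer with a new style Implementation service, follow this scheme:

// the old-style Implementer receives the document, the full tree and the lexical at the cursor:
set[loc] oldImplementer(loc document, Tree fullTree, Tree selection) {
    ...
}
// it is replaced by an Implementation service that matches on the Focus:
set[loc] newImplementationService([Tree selection, *Tree _spine, Tree fullTree]) {
    loc document = selection@\loc.top;
    ...
}
// the default covers focus lists that do not match, such as the empty list:
default set[loc] newImplementationService(list[Tree] _focus) = {};
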
LanguageService implementer(Implementer d)

function summarizer

A summarizer collects information for later use in interactive IDE features.

danger

deprecated: marked for future deletion Please use build or Analysis.

LanguageService summarizer(Summarizer summarizer
, bool providesDocumentation = true
, bool providesHovers = providesDocumentation
, bool providesDefinitions = true
, bool providesReferences = true
, bool providesImplementations = true)

function analyzer

An analyzer collects information for later use in interactive IDE features.

danger

deprecated: marked for future deletion Please use build or Analysis.

LanguageService analyzer(Summarizer summarizer
, bool providesDocumentation = true
, bool providesDefinitions = true
, bool providesReferences = true
, bool providesImplementations = true)

data Summary

A model encodes all IDE-relevant information about a single source file.

data Summary  
= summary(loc src,
rel[loc, Message] messages = {},
rel[loc, str] documentation = {},
rel[loc, str] hovers = documentation,
rel[loc, loc] definitions = {},
rel[loc, loc] references = {},
rel[loc, loc] implementations = {}
)
;
  • src refers to the "compilation unit" or "file" that this model is for.
  • messages collects all the errors, warnings and info messages.
  • documentation is the deprecated name for hovers
  • hovers maps uses of concepts to a documentation message that can be shown as a hover.
  • definitions maps use locations to declaration locations to implement "jump-to-definition".
  • references maps declaration locations to use locations to implement "jump-to-references".
  • implementations maps the declaration of a type/class to its implementations to implement "jump-to-implementations".
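
To show how the definitions and references relations mirror each other, here is a minimal sketch of a summarizer. It rests on a made-up convention for a hypothetical toy grammar: the textually first occurrence of a name counts as its declaration and all later occurrences count as uses. The names demo::MyAnalysis and myAnalysis are assumptions as well.

```rascal
module demo::MyAnalysis

import util::LanguageServer;
import ParseTree;
import List;
import Set;

// Hypothetical toy grammar: a program is a sequence of lower-case identifiers.
lexical Id = [a-z]+ !>> [a-z];
layout Whitespace = [\ \t\n\r]* !>> [\ \t\n\r];
start syntax Program = Id*;

Summary myAnalysis(loc origin, Tree input) {
    // all occurrences of every name, with their source locations
    rel[str, loc] occurrences = {<"<id>", id.src> | /Id id := input};

    rel[loc, loc] defs = {};
    for (str name <- occurrences<0>) {
        // order the occurrences of this name by their offset in the file
        list[loc] ordered = sort(toList(occurrences[name]),
            bool (loc a, loc b) { return a.offset < b.offset; });
        // assumption: the first occurrence is the declaration, the rest are uses
        defs += {<use, ordered[0]> | loc use <- ordered[1..]};
    }

    // definitions map uses to declarations; references are the inverse relation
    return summary(origin, definitions = defs, references = defs<1, 0>);
}
```

A function like this would be contributed as analysis(myAnalysis). The fields that are not set keep their empty defaults.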

data Completion

data Completion  
= completion(str newText, str proposal=newText)
;

data DocumentSymbol

DocumentSymbol encodes a sorted and hierarchical outline of a source file.

data DocumentSymbol  
= symbol(
str name,
DocumentSymbolKind kind,
loc range,
loc selection=range,
str detail="",
list[DocumentSymbol] children=[]
)
;

data DocumentSymbolKind

data DocumentSymbolKind  
= \file()
| \module()
| \namespace()
| \package()
| \class()
| \method()
| \property()
| \field()
| \constructor()
| \enum()
| \interface()
| \function()
| \variable()
| \constant()
| \string()
| \number()
| \boolean()
| \array()
| \object()
| \key()
| \null()
| \enumMember()
| \struct()
| \event()
| \operator()
| \typeParameter()
;

data DocumentSymbolTag

data DocumentSymbolTag  
= \deprecated()
;

data CompletionProposal

data CompletionProposal  
= sourceProposal(str newText, str proposal=newText)
;

data Message

Attach any command to a message for it to be exposed as a quick-fix code action automatically.

data Message (list[CodeAction] fixes = [])

The fixes you provide with a message will be hinted at by a light-bulb in the editor's margin. Every fix listed here will be a menu item in the pop-up menu when the bulb is activated (via short-cut or otherwise).

Note that for a Code Action to be executed, you must provide edits directly, or handle a Command and add its execution to the Command Executor contribution function, or both.

Benefits

  • the information required to produce an error message is usually also required for the fix. So this coupling of message with fixes may come in handy.

Pitfalls

  • the code for error messaging may become cluttered with code for fixes. It is advisable to only collect information for the fix and store it in a Command constructor inside the Code Action, or to delegate to a function that produces the right Document Edits immediately.
  • don't forget to extend Command with a new constructor and Command Executor with a new overload to handle that constructor.
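
The following sketch shows one way to attach a fix to a message while keeping the messaging code small: the fix carries its Document Edits directly, so no Command is needed. The function name misspelled and its parameters are hypothetical; changed and replace are assumed to come from analysis::diff::edits::TextEdits, and error from Message.

```rascal
module demo::MyFixes

import util::LanguageServer;
import analysis::diff::edits::TextEdits;
import Message;

// Produces an error at `at` with a quick-fix that replaces the text at `at` by `suggestion`.
Message misspelled(loc at, str suggestion)
    = error("Unknown name", at,
        fixes = [
            action(
                title = "Replace with <suggestion>",
                edits = [changed(at.top, [replace(at, suggestion)])]
            )
        ]);
```

Messages built this way would typically end up in the messages relation of a Summary.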

data Command

A Command is a parameter to a CommandExecutor function.

data Command (str title="") 
= noop()
;

Commands can be any closed term (a pure value without open variables or function/closure values embedded in it). Add any constructor you need to express the execution parameters of a command.

You write the Command Executor to interpret each kind of Command individually. A Command constructor must have fields or keyword fields that hold the parameters of the to-be-executed command.

Commands are produced for delayed and optional execution by:

  • Lens Detector, where they will be executed if the lens is selected in the editor
  • Code Action Contributor, where they will appear in context-menus for quick-fix and refactoring
  • Message, where they will appear in context-menus on lines with error or warning diagnostics

See also Code Action; a wrapper for Command for fine-tuning UI interactions.

Examples

// here we invent a new command name `showFlowDiagram` which is parametrized by a loc:
data Command = showFlowDiagram(loc src);

// and we have our own evaluator that executes the showFlowDiagram command by starting an interactive view:
value evaluator(showFlowDiagram(loc src)) {
showInteractiveContent(flowDiagram(src));
return true;
}

Pitfalls

  • Sometimes a command must be wrapped in a Code Action to make it effective (see Code Action Contributor and Message).
  • the noop() command will always be ignored.
  • never add first-class functions or closures as a parameter or keyword field to a Command. The Command will be serialized, sent to the LSP client, and then sent back to the LSP server for execution. Functions can not be serialized, so that would lead to run-time errors.
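
To complement the example above, this sketch shows the producing side of a Command: a lens function that attaches a command to the whole file, next to the executor overload that interprets it. The command reportSize and the names myLenses and myExecutor are invented for this illustration, and where the printed text ends up depends on the server set-up.

```rascal
module demo::MyLenses

import util::LanguageServer;
import ParseTree;
import IO;

// Hypothetical command, parametrized only by a serializable loc.
data Command = reportSize(loc src);

// One lens for the whole file; the title keyword field is what the editor displays.
lrel[loc src, Command lens] myLenses(Tree input)
    = [<input.src, reportSize(input.src, title = "Report file size")>];

// The executor overload that interprets this kind of command.
value myExecutor(reportSize(loc src)) {
    println("<src.top> spans <src.length> characters");
    return true;
}
```

Both functions would be contributed together, as codeLens(myLenses) and execution(myExecutor), because a lens without an executor for its command has no effect.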

data CodeAction

Code actions encapsulate computed effects on source code like quick-fixes, refactorings or visualizations.

data CodeAction  
= action(
list[DocumentEdit] edits = [],
Command command = noop(),
str title = command.title,
CodeActionKind kind = quickfix()
)
;

Code actions are an intermediate representation of what is about to happen to the source code that is loaded in the IDE, or even in a live editor. They communicate what can (possibly) be done to improve the user's code, who might choose one of the options from a list, or even look at different outcomes ahead-of-time in a preview.

The edits and command parameters are both optional, and can be provided at the same time as well.

If edits (a list of Document Edits) are provided, then:

  1. edits can be used for preview of a quick-fix or refactoring
  2. edits are always applied first before any command is executed.
  3. edits can always be undone via the undo command of the IDE

If a command (a Command) is provided, then:

  1. The title of the command is shown to the user
  2. The user picks this code action (from a list, or by pressing "OK" in a dialog)
  3. Any edits (see above) are applied first
  4. The command is executed on the server side via the Command Executor contribution
  5. The effects of commands can be undone if they were Document Edits, but other effects like diagnostics and interactive content have to be cleaned or closed in their own respective fashions.

Benefits

  • CodeActions integrate tightly with the user experience in the IDE: sometimes with previews, and always with the undo stack.
  • CodeActions can be implemented "on the language level", abstracting from UI and scheduling details. See also edits for tools that can produce lists of Document Edits by diffing parse trees or abstract syntax trees.
  • edits are applied on the latest editor content for the current editor; live to the user.
  • applyDocumentsEdits also works on open editor contents for the current editor.
  • The parse tree for the current file is synchronized with the call to a Code Action Contributor such that edits and input are computed in-sync.

Pitfalls

  • applyDocumentsEdits and edits when pointing to other files than the current one, may or may not work on the current editor contents. If you want to be safe it's best to only edit the current file.
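
A Code Action Contributor that stays on the safe side by editing only the current file could be sketched like this. The grammar and the names demo::MyActions and myActions are hypothetical; changed and replace are assumed to come from analysis::diff::edits::TextEdits.

```rascal
module demo::MyActions

import util::LanguageServer;
import analysis::diff::edits::TextEdits;
import ParseTree;
import String;

// Hypothetical toy grammar: a program is a sequence of identifiers.
lexical Id = [a-zA-Z]+ !>> [a-zA-Z];
layout Whitespace = [\ \t\n\r]* !>> [\ \t\n\r];
start syntax Program = Id*;

// Offers a refactoring whenever an identifier occurs somewhere in the focus.
list[CodeAction] myActions([*Tree _below, Id id, *Tree _above]) = [
    action(
        title = "Convert <id> to upper case",
        // the edit only touches the file that contains the identifier itself
        edits = [changed(id.src.top, [replace(id.src, toUpperCase("<id>"))])],
        kind = refactor()
    )
];

// The default covers focus lists without an identifier, including the empty list.
default list[CodeAction] myActions(Focus _focus) = [];
```

The function would be contributed as codeAction(myActions). Since it only provides edits and no command, no Execution service is needed for it.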

data CodeActionKind

Kinds are used to prioritize menu options and choose relevant icons in the UI.

data CodeActionKind  
= empty()
| refactor(RefactorKind refactor = rewrite())
| quickfix()
| source(SourceActionKind source = organizeImports())
;

This is an open data type. The constructor names are used to compute the string values for the LSP by capitalizing and joining parent/children with periods.

Examples

refactor(refactor = rewrite()) becomes refactor.rewrite under the hood of the LSP.

data SourceActionKind

Used to prioritize menu options and choose relevant icons in the UI.

data SourceActionKind  
= organizeImports()
| fixAll()
;

This is an open list and can be extended by the language engineer at will. These names should be indicative of what will happen to the source code when the action is chosen.

Pitfalls

  • You as language engineer are responsible for implementing the right action with these names.

data RefactorKind

Used to prioritize menu options and choose relevant icons in the UI.

data RefactorKind  
= extract()
| inline()
| rewrite()
;

This is an open list and can be extended by the language engineer at will. These names should be indicative of what will happen to the source code when the action is chosen.

Pitfalls

  • You as language engineer are responsible for implementing the right action with these names.

data InlayHint

Represents one inlayHint for display in an editor.

data InlayHint  
= hint(
loc position,
str label,
InlayKind kind,
str toolTip = "",
bool atEnd = false
)
;
  • position is where the hint should be placed; by default at the beginning of this location, set atEnd to true to change this.
  • label is the text that should be printed in the IDE; spaces at the front and back of the text are trimmed and turned into subtle spacing to the content around it.
  • kind is either type() or parameter(), which influences styling in the editor.
  • toolTip optionally shows extra information when hovering over the inlay hint.
  • atEnd makes the hint appear at the end of the position instead of at the beginning.
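
As a small illustration of these fields, the sketch below places a hint behind every identifier of a hypothetical toy grammar. The label text, the tool tip and the names demo::MyHints and myHints are invented for this example.

```rascal
module demo::MyHints

import util::LanguageServer;
import ParseTree;

// Hypothetical toy grammar: a program is a sequence of lower-case identifiers.
lexical Id = [a-z]+ !>> [a-z];
layout Whitespace = [\ \t\n\r]* !>> [\ \t\n\r];
start syntax Program = Id*;

// One hint per identifier, shown at the end of the identifier because atEnd is true.
list[InlayHint] myHints(Tree input)
    = [hint(id.src, " : name", parameter(), toolTip = "An identifier", atEnd = true)
      | /Id id := input];
```

The function would be contributed as inlayHint(myHints). The leading space in the label is there because surrounding spaces are turned into subtle spacing, as described above.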

data InlayKind

Style of an inlay.

data InlayKind  
= \type()
| parameter()
;

function registerLanguage

Generates and instantiates a new language server for the given language.

void registerLanguage(Language lang)

We register languages by uploading the meta-data of the implementation to a "language-parametric" language server.

  1. The meta-data is used to instantiate a fresh run-time to execute the main-module.
  2. The extension is registered with the IDE to link to this new runtime.
  3. Each specific extension is mapped to a specific part of the language server protocol.

By registering a language twice, more things can happen:

  • existing contributions are re-loaded and overwritten with the newest version.
  • new contributions to an existing language (Language constructor instance), will be added to the existing LSP server instance. You can use this to load expensive features later or more lazily.
  • errors appear during loading or first execution of the contribution. The specific contribution is then usually aborted and unregistered.

Because registerLanguage has effect in a different OS process, errors and warnings are not printed in the calling execution context. In general look at the "Parametric Rascal Language Server" log tab in the IDE to see what is going on.

However since language contributions are just Rascal functions, it is advised to simply test them first right there in the terminal. Use util::Reflective::getProjectPathConfig for a representative configuration.

function unregisterLanguage

Spins down and removes a previously registered language server.

void unregisterLanguage(Language lang)

function unregisterLanguage

Spins down and removes a previously registered language server.

void unregisterLanguage(str name, set[str] extensions, str mainModule = "", str mainFunction = "")

function unregisterLanguage

Spins down and removes a previously registered language server.

danger

deprecated: marked for future deletion Replaced by the new overload that takes a set of extensions.

void unregisterLanguage(str name, str extension, str mainModule = "", str mainFunction = "")

function computeFocusList

Produce a Focus for a given tree and cursor position.

Focus computeFocusList(Tree input, int line, int column)

This function exists to be able to unit test Language Services that accept a Focus parameter, independently of using Register Language.

  • line is a 1-based indication of what the current line is
  • column is a 0-based indication of what the current column is.

Benefits

  • test services without spinning up an LSP server or having to run UI tests. Each UI interaction is tested generically for you already.

Pitfalls

  • LSP indexing is different, but those differences are resolved in the implementation of the protocol. On the Rascal side, we see the above. Differences are width of the character encoding for non-ASCII characters, and lines are 0-based, etc.
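
A unit test along these lines could be sketched as follows. The grammar, the module name and the example input are hypothetical, and the expectations in the test are taken from the description of Focus earlier on this page rather than from an actual run.

```rascal
module demo::MyFocusTest

import util::LanguageServer;
import ParseTree;

// Hypothetical toy grammar: a program is a sequence of lower-case identifiers.
lexical Id = [a-z]+ !>> [a-z];
layout Whitespace = [\ \t\n\r]* !>> [\ \t\n\r];
start syntax Program = Id*;

test bool focusRunsFromIdentifierToRoot() {
    Tree t = parse(#start[Program], "foo bar", |unknown:///example.mylang|);

    // line 1 (1-based), column 1 (0-based): the cursor is inside "foo"
    Focus f = computeFocusList(t, 1, 1);

    // expected per the Focus description: smallest tree first, whole tree last
    return f != [] && "<f[0]>" == "foo" && f[-1] == t;
}
```

The resulting Focus can be passed straight into a service function, for example a hover or definition function, to test it without a running language server.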