June 02, 2023·vincent

Micro-frontends: import-html-entry

JavaScript · Micro-frontends

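Micro-frontends have been getting more and more attention lately. qiankun is a fairly mature micro-frontend framework from Ant Financial (蚂蚁金服), built on top of single-spa. It turns a single monolithic web application into one application assembled from several smaller front-end applications, and it is especially suited to legacy projects whose technology stack has become hard to maintain but that still need new technology to keep shipping features.

One of qiankun's distinguishing features is that it uses an HTML file as the entry point. JavaScript bundles are usually given filenames generated from their content so that they can be cached, which means the entry script's name cannot be pinned down; using HTML as the entry sidesteps that. In effect the static HTML serves as a resource manifest, which also avoids some other potential problems. This article looks at import-html-entry, the library qiankun relies on to support HTML entries, at version 1.7.3.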

importHTML

importHTML is the default export of import-html-entry and returns a promise. Its signature is:

importHTML(url, opts = {})

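Parameters:

url: path of the HTML template to parse
opts: defaults to an empty object
When a function is passed, it is used directly as fetch.
When an object is passed, its properties control how the HTML template is parsed; any property that is omitted falls back to the module's built-in default.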

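| Property | Argument | Return value | Purpose | Default |
|---|---|---|---|---|
| fetch | `url: string` | promise | Fetches the contents of remote script and style files | The browser's `fetch`; an error is raised if the browser does not support it |
| getPublicPath | template `url: string` | `publicPath: string` | Returns the `publicPath` for static assets; external resources referenced by relative path in the template are converted to absolute paths | Uses the current `location.href` as the `publicPath` |
| getDomain | ?? | string | Used when `getPublicPath` is not provided; if neither is provided, the default `getPublicPath` is used | None |
| getTemplate | HTML template `string` | HTML template `string` | Lets the caller preprocess the template before it is parsed | No processing |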
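
The two accepted shapes of opts can be illustrated with a small stand-alone sketch. The three default* functions here are placeholders invented for the example, not the library's real defaults, and normalizeOpts is a hypothetical helper name; only the precedence rules come from the description above.

```javascript
// Placeholder defaults, invented for this sketch (not the library's real ones).
const defaultFetch = url => Promise.resolve({ text: () => Promise.resolve('') });
const defaultGetPublicPath = url => '/';
const defaultGetTemplate = tpl => tpl;

// Accept either a bare fetch function (legacy form) or an options object.
function normalizeOpts(opts = {}) {
  if (typeof opts === 'function') {
    return { fetch: opts, getPublicPath: defaultGetPublicPath, getTemplate: defaultGetTemplate };
  }
  return {
    fetch: opts.fetch || defaultFetch,
    // getPublicPath wins over getDomain; the default is the last resort.
    getPublicPath: opts.getPublicPath || opts.getDomain || defaultGetPublicPath,
    getTemplate: opts.getTemplate || defaultGetTemplate,
  };
}

const myFetch = url => Promise.resolve({ text: () => Promise.resolve('<html></html>') });
const fromFunction = normalizeOpts(myFetch);
const fromObject = normalizeOpts({ getDomain: () => 'https://example.com/' });
```

Passing a function only replaces fetch, while an object can override any subset of the three hooks.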

The returned promise resolves with an object that has the following properties.

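| Property | Type | Description | Arguments |
|---|---|---|---|
| template | string | The processed `html` template string, with external style files replaced by inline styles | - |
| assetPublicPath | string | The `baseURL` for static assets | - |
| getExternalScripts | function returning a promise | Extracts the content of every `script` tag in the template, in order of appearance, into an array | - |
| getExternalStyleSheets | function returning a promise | Extracts the content of every `link` and `style` tag in the template, in order of appearance, into an array | - |
| execScripts | function returning a promise | Executes the code of every `script` and returns the object exported by the module that the template's entry script link `entry` points to | See below |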

 1  export default function importHTML(url, opts = {}) {
 2    let fetch = defaultFetch;
 3    let getPublicPath = defaultGetPublicPath;
 4    let getTemplate = defaultGetTemplate;
 5
 6    // compatible with the legacy importHTML api
 7    if (typeof opts === 'function') {
 8      fetch = opts;
 9    } else {
10      fetch = opts.fetch || defaultFetch;
11      getPublicPath = opts.getPublicPath || opts.getDomain || defaultGetPublicPath;
12      getTemplate = opts.getTemplate || defaultGetTemplate;
13    }
14
15    return embedHTMLCache[url] || (embedHTMLCache[url] = fetch(url)
16      .then(response => response.text())
17      .then(html => {
18        const assetPublicPath = getPublicPath(url);
19        const { template, scripts, entry, styles } = processTpl(getTemplate(html), assetPublicPath);
20
21        return getEmbedHTML(template, styles, { fetch }).then(embedHTML => ({
22          template: embedHTML,
23          assetPublicPath,
24          getExternalScripts: () => getExternalScripts(scripts, fetch),
25          getExternalStyleSheets: () => getExternalStyleSheets(styles, fetch),
26          execScripts: (proxy, strictGlobal) => {
27            if (!scripts.length) {
28              return Promise.resolve();
29            }
30            return execScripts(entry, scripts, proxy, { fetch, strictGlobal });
31          }
32        }));
33      }));
34  }

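Lines 1~13 handle the type of the opts argument and fill in the defaults.
Line 15 caches the parsing work: if the same url has already been processed, the cached result is returned directly; otherwise the template string is fetched with fetch and processed further.
Line 19: processTpl is the core template-parsing function and is covered later. Given the template and assetPublicPath (the URL prefix for external scripts and styles), it returns template, the template string after a first round of processing; scripts, an array of the src values of all external scripts; styles, an array of the href values of all external styles; and entry, the link of the template's entry script mentioned above. If no script tag in the template is marked as entry, the src of the last script tag is returned.
Line 21 calls getEmbedHTML to convert all externally linked styles into inline styles. The code of getEmbedHTML is fairly simple and can be read directly.
Lines 24~31 use three functions, getExternalScripts, getExternalStyleSheets and execScripts. Let's go through them one by one.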
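
The caching on line 15 stores the promise rather than the fetched text. A minimal sketch of that pattern follows; fakeFetch, loadOnce and the URLs are made up for the illustration and stand in for the real fetch and importHTML.

```javascript
const cache = {};
let fetchCalls = 0;

// Hypothetical fetch stand-in that counts how often it is called.
function fakeFetch(url) {
  fetchCalls += 1;
  return Promise.resolve({ text: () => Promise.resolve(`<p>${url}</p>`) });
}

// Store the promise itself, so concurrent callers share one in-flight request.
function loadOnce(url) {
  return cache[url] || (cache[url] = fakeFetch(url).then(res => res.text()));
}

const first = loadOnce('/app.html');
const second = loadOnce('/app.html'); // served from the cache, no second fetch
const other = loadOnce('/other.html');
```

One consequence of caching the promise is that a rejected request stays in the cache as well, so a later call with the same url gets the same rejection.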

getExternalStyleSheets

export function getExternalStyleSheets(styles, fetch = defaultFetch) {
  return Promise.all(styles.map(styleLink => {
    if (isInlineCode(styleLink)) {
      // if it is inline style
      return getInlineCode(styleLink);
    } else {
      // external styles
      return styleCache[styleLink] ||
          (styleCache[styleLink] = fetch(styleLink).then(response => response.text()));
    }
  },
  ));
}

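The first argument is the array of all link and style tags in the template; the second is the fetch used for requests. The function is simple: it tells link and style entries apart, obtains the actual style content for each, and returns the contents as an array.

I later noticed that when the template is parsed, the contents of style tags are not actually put into styles. I'm not sure whether that is an oversight; an issue is in preparation.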
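
The inline/external distinction can be sketched as follows. The two helpers are written here under the assumption that an inline entry is a complete tag string starting with "<" and anything else is a URL; the library's own isInlineCode and getInlineCode may differ in detail.

```javascript
// Assumption: an entry that starts with "<" is a whole inline tag, anything else is a URL.
const isInlineCode = code => code.startsWith('<');

// Return the text between the opening tag's ">" and the closing tag's "<".
function getInlineCode(tag) {
  const start = tag.indexOf('>') + 1;
  const end = tag.lastIndexOf('<');
  return tag.substring(start, end);
}

const entries = ['<style>body { margin: 0; }</style>', 'https://example.com/app.css'];
const kinds = entries.map(e => (isInlineCode(e) ? 'inline' : 'external'));
const css = getInlineCode(entries[0]);
```

Only the external entry would need a network request; the inline one already carries its content.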

getExternalScripts

 1  export function getExternalScripts(scripts, fetch = defaultFetch) {
 2    const fetchScript = scriptUrl => scriptCache[scriptUrl] ||
 3      (scriptCache[scriptUrl] = fetch(scriptUrl).then(response => response.text()));
 4
 5    return Promise.all(scripts.map(script => {
 6      if (typeof script === 'string') {
 7        if (isInlineCode(script)) {
 8          // if it is inline script
 9          return getInlineCode(script);
10        } else {
11          // external script
12          return fetchScript(script);
13        }
14      } else {
15        // use idle time to load async script
16        const { src, async } = script;
17        if (async) {
18          return {
19            src,
20            async: true,
21            content: new Promise((resolve, reject) => requestIdleCallback(() => fetchScript(src).then(resolve, reject)))
22          };
23        }
24
25        return fetchScript(src);
26      }
27    },
28    ));
29  }

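The first argument is the array of all script tags in the template; the second is the fetch used for requests.

Lines 2~3 wrap fetch to add caching.
Line 6: this check separates out resources that are configured as objects, which can happen when the function is called from importEntry; for example, scripts may look like [{src:"http://xxx.com/static/xx.js",async:true},...].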
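
How an object entry marked async differs from a plain URL can be seen in a reduced sketch. It leaves out inline tags and caching, fakeScriptFetch and loadScript are invented names, and the setTimeout fallback for environments without requestIdleCallback is an addition of this sketch rather than something the listing above does.

```javascript
let scriptFetches = 0;

// Hypothetical fetch stand-in; a real one would hit the network.
const fakeScriptFetch = url => {
  scriptFetches += 1;
  return Promise.resolve({ text: () => Promise.resolve(`/* ${url} */`) });
};

// Fallback for environments without requestIdleCallback (an addition of this sketch).
const whenIdle = typeof requestIdleCallback === 'function'
  ? cb => requestIdleCallback(cb)
  : cb => setTimeout(cb, 0);

function loadScript(script, fetch) {
  const fetchText = url => fetch(url).then(res => res.text());
  if (typeof script === 'string') {
    return fetchText(script); // plain URL taken from the template
  }
  const { src, async } = script;
  if (async) {
    // Defer the download; the caller gets a descriptor whose content resolves later.
    return {
      src,
      async: true,
      content: new Promise((resolve, reject) => whenIdle(() => fetchText(src).then(resolve, reject))),
    };
  }
  return fetchText(src);
}

const deferred = loadScript({ src: 'https://example.com/a.js', async: true }, fakeScriptFetch);
const fetchesBeforeIdle = scriptFetches; // the async script has not been requested yet
const eager = loadScript({ src: 'https://example.com/b.js' }, fakeScriptFetch);
```

The async entry comes back as a descriptor object instead of a promise of text, which is why execScripts has to treat the two shapes differently.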

execScripts

The function is long, so the performance-measurement code has been removed from the listing below, leaving only the functional code.

 1  export function execScripts(entry, scripts, proxy = window, opts = {}) {
 2    const { fetch = defaultFetch, strictGlobal = false } = opts;
 3
 4    return getExternalScripts(scripts, fetch)
 5      .then(scriptsText => {
 6
 7        const geval = eval;
 8
 9        function exec(scriptSrc, inlineScript, resolve) {
10
11          if (scriptSrc === entry) {
12            noteGlobalProps(strictGlobal ? proxy : window);
13
14            // bind window.proxy to change `this` reference in script
15            geval(getExecutableScript(scriptSrc, inlineScript, proxy, strictGlobal));
16
17            const exports = proxy[getGlobalProp(strictGlobal ? proxy : window)] || {};
18            resolve(exports);
19
20          } else {
21
22            if (typeof inlineScript === 'string') {
23              // bind window.proxy to change `this` reference in script
24              geval(getExecutableScript(scriptSrc, inlineScript, proxy, strictGlobal));
25            } else {
26              // external script marked with async
27              inlineScript.async && inlineScript?.content
28                .then(downloadedScriptText => geval(getExecutableScript(inlineScript.src, downloadedScriptText, proxy, strictGlobal)))
29                .catch(e => {
30                  console.error(`error occurs while executing async script ${ inlineScript.src }`);
31                  throw e;
32                });
33            }
34          }
35        }
36
37        function schedule(i, resolvePromise) {
38
39          if (i < scripts.length) {
40            const scriptSrc = scripts[i];
41            const inlineScript = scriptsText[i];
42
43            exec(scriptSrc, inlineScript, resolvePromise);
44            // resolve the promise while the last script executed and entry not provided
45            if (!entry && i === scripts.length - 1) {
46              resolvePromise();
47            } else {
48              schedule(i + 1, resolvePromise);
49            }
50          }
51        }
52
53        return new Promise(resolve => schedule(0, resolve));
54      });
55  }

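Line 4 calls getExternalScripts to get an array with the content of every script tag.
Line 53 is where to start reading: the call to schedule kicks off execution from the first item of the script content array.
Lines 37~51 define schedule. As the code shows, it is a recursive function that stops once the array has been walked through. Note line 45: when no entry is provided, the promise is resolved as soon as the last script has run, in this case without an export object. (The template parser, for its part, falls back to the last script as the entry, so an HTML entry normally has one.)
exec is fairly simple: it distinguishes the entry script from the others and hands the entry module's exports back through resolve, see line 18.

The overall logic is straightforward; the handling of entry is the part to pay attention to.

Also note that the script string produced by getExecutableScript is executed with eval called indirectly. The indirect call ensures the code runs in the global context rather than touching the local scope. If this is unfamiliar, see "The magic of eval() and new Function()", "[Translation] Executing JavaScript code with eval() and new Function()", and "Never use eval".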
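
The scheduling logic can be reduced to a synchronous sketch. runAll and its parameters are hypothetical names, and exec here is any function that "runs" a script and returns its result, so the eval and global-property detection are left out.

```javascript
// Run scripts in order; report the entry's result (or plain completion) through `done`.
function runAll(names, entry, exec, done) {
  function schedule(i) {
    if (i >= names.length) return;
    const result = exec(names[i]);
    if (names[i] === entry) done(result); // entry found: hand back its exports
    if (!entry && i === names.length - 1) {
      done();                              // no entry: just signal completion
    } else {
      schedule(i + 1);
    }
  }
  schedule(0);
}

// With an entry: every script runs in order, and the entry's result is reported.
const order = [];
let exported;
runAll(['a.js', 'main.js', 'c.js'], 'main.js',
  name => { order.push(name); return { from: name }; },
  result => { exported = result; });

// Without an entry: done is called once, with no value, after the last script.
const doneArgs = [];
runAll(['a.js', 'b.js'], undefined, name => name, result => { doneArgs.push(result); });
```

Scripts listed after the entry still run; only the value passed to done depends on which script is the entry.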
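
The difference between the two ways of calling eval can be checked with a few lines. The function and variable names are arbitrary; the sketch assumes no global variable called hiddenLocal exists.

```javascript
const geval = eval; // calling eval through another name makes it an indirect eval

function viaDirectEval() {
  const hiddenLocal = 'local';
  return eval('typeof hiddenLocal');  // direct eval sees the enclosing function scope
}

function viaIndirectEval() {
  const hiddenLocal = 'local';
  return geval('typeof hiddenLocal'); // indirect eval runs in the global scope
}

const directResult = viaDirectEval();     // 'string'
const indirectResult = viaIndirectEval(); // 'undefined'
```

This is why the executed script cannot accidentally read or overwrite the local variables of exec.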

getExecutableScript

The main job of this function is to rewrite the script string so that window/self/this point to a different object when the script runs.

function getExecutableScript(scriptSrc, scriptText, proxy, strictGlobal) {
  const sourceUrl = isInlineCode(scriptSrc) ? '' : `//# sourceURL=${scriptSrc}\n`;

  window.proxy = proxy;
  // TODO: switch to the `with` closure via strictGlobal; merge once the pitfalls of the `with` approach are ironed out
  return strictGlobal
    ? `;(function(window, self){with(window){;${scriptText}\n${sourceUrl}}}).bind(window.proxy)(window.proxy, window.proxy);`
    : `;(function(window, self){;${scriptText}\n${sourceUrl}}).bind(window.proxy)(window.proxy, window.proxy);`;
}

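The core of it is this: ;(function(window, self){;${ scriptText }\n${ sourceUrl }}).bind(window.proxy)(window.proxy, window.proxy);.

Broken down step by step:

// declare a function
let scriptText = "xxx";
let sourceUrl = "xx";
let fn = function(window, self){
    // the actual script content
};
// change what `this` refers to inside the function
let fnBind = fn.bind(window.proxy);
// call the function, supplying its window and self parameters
fnBind(window.proxy, window.proxy);

This gives the script string a simple execution environment of its own, one that shadows the global this, window and self. Note that proxy defaults to window in execScripts, so by default the script still sees the real window; a different object is used only when the caller passes one in.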
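
The effect of the wrapper can be tried outside a browser with a small sketch. It uses the same shape as the non-strict wrapper, but reads the proxy from globalThis.__sandbox instead of window.proxy so that it also runs where window does not exist; fakeWindow, __sandbox and wrap are names made up for the example.

```javascript
// Hypothetical stand-in for the proxy object a sandbox would supply.
const fakeWindow = { label: 'sandbox' };
globalThis.__sandbox = fakeWindow;

// Same shape as the non-strict wrapper, reading the proxy from globalThis.__sandbox.
function wrap(scriptText) {
  return `;(function(window, self){;${scriptText}\n}).bind(globalThis.__sandbox)(globalThis.__sandbox, globalThis.__sandbox);`;
}

const script = 'window.fromWindow = 1; self.fromSelf = 2; this.fromThis = 3;';
(0, eval)(wrap(script)); // indirect eval, so the wrapper is evaluated in the global scope
```

All three writes land on fakeWindow and nothing leaks onto the real global object. The proxy has to be parked on a global property because code run through indirect eval can only reach globals.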

importEntry

export function importEntry(entry, opts = {}) {
  // ...

  // html entry
  if (typeof entry === 'string') {
    return importHTML(entry, { fetch, getPublicPath, getTemplate });
  }

  // config entry
  if (Array.isArray(entry.scripts) || Array.isArray(entry.styles)) {

    const { scripts = [], styles = [], html = '' } = entry;
    const setStylePlaceholder2HTML = tpl => styles.reduceRight((html, styleSrc) => `${ genLinkReplaceSymbol(styleSrc) }${ html }`, tpl);
    const setScriptPlaceholder2HTML = tpl => scripts.reduce((html, scriptSrc) => `${ html }${ genScriptReplaceSymbol(scriptSrc) }`, tpl);

    return getEmbedHTML(getTemplate(setScriptPlaceholder2HTML(setStylePlaceholder2HTML(html))), styles, { fetch }).then(embedHTML => ({
      // handled the same way as in importHTML; omitted
    }));

  } else {
    throw new SyntaxError('entry scripts or styles should be array!');
  }
}

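The first argument, entry, can be a string or an object. A string behaves exactly like importHTML. An object carries the resource lists for scripts and styles, for example: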

{
    html: "http://xxx.com/static/tpl.html",
    scripts: [
        {
            src: "http://xxx.com/static/xx.js",
            async: true
        },
        ...
    ],
    styles: [
        {
            href: "http://xxx.com/static/style.css"
        },
        ...
    ]
}
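
The reduceRight/reduce pair in importEntry decides where the placeholders end up relative to the markup. The sketch below shows the ordering, assuming styles and scripts are given as plain URL strings; the two placeholder generators are invented and do not reproduce the markers the library actually emits.

```javascript
// Hypothetical placeholder generators; the library's real markers may look different.
const linkPlaceholder = href => `<!-- link ${href} -->`;
const scriptPlaceholder = src => `<!-- script ${src} -->`;

function addPlaceholders(html, styles, scripts) {
  // reduceRight prepends from the last style to the first, so the final order matches `styles`.
  const withStyles = styles.reduceRight((acc, href) => `${linkPlaceholder(href)}${acc}`, html);
  // reduce appends the scripts after the markup, in order.
  return scripts.reduce((acc, src) => `${acc}${scriptPlaceholder(src)}`, withStyles);
}

const result = addPlaceholders('<div id="app"></div>', ['a.css', 'b.css'], ['x.js', 'y.js']);
```

Styles end up before the markup and scripts after it, each group in its original order.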

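Miscellaneous

src/process-tpl.js does one thing: it classifies and collects the resources and returns them. There is nothing hard to follow in it.
src/utils provides helper functions. getGlobalProp and noteGlobalProps are the interesting ones: they derive the entry's export from how the properties on window change before and after the entry runs. Both rely on the fact that the order of object properties is predictable.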

// Module-level state shared by the two functions below.
let firstGlobalProp, secondGlobalProp, lastGlobalProp;

export function getGlobalProp(global) {
  let cnt = 0;
  let lastProp;
  let hasIframe = false;

  for (const p in global) {
    if (shouldSkipProperty(global, p)) { continue; }

    // Walk the iframes: if a property value on window is an iframe, skip the first/second checks below
    for (let i = 0; i < window.frames.length && !hasIframe; i++) {
      const frame = window.frames[i];
      if (frame === global[p]) {
        hasIframe = true;
        break;
      }
    }

    if (!hasIframe && (cnt === 0 && p !== firstGlobalProp || cnt === 1 && p !== secondGlobalProp)) { return p; }
    cnt++;
    lastProp = p;
  }

  if (lastProp !== lastGlobalProp) { return lastProp; }
}

export function noteGlobalProps(global) {
  // alternatively Object.keys(global).pop()
  // but this may be faster (pending benchmarks)
  firstGlobalProp = secondGlobalProp = undefined;

  for (const p in global) {
    if (shouldSkipProperty(global, p)) { continue; }
    if (!firstGlobalProp) { firstGlobalProp = p; } else if (!secondGlobalProp) { secondGlobalProp = p; }
    lastGlobalProp = p;
  }

  return lastGlobalProp;
}

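noteGlobalProps records the state of window's properties before the entry runs; when the entry module executes, it attaches its exports to window.
getGlobalProp detects how window changed after the entry module ran, works out from the change which property holds the entry's export, and returns that property's name.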
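
The idea can be shown on a plain object. This is a deliberately simplified sketch that only compares the last enumerable key before and after, leaving out the first/second-property and iframe checks; noteLastKey, findNewKey and fakeGlobal are hypothetical names.

```javascript
let lastKeyBefore;

// Remember the last enumerable key before the entry script runs.
function noteLastKey(globalLike) {
  lastKeyBefore = undefined;
  for (const key in globalLike) lastKeyBefore = key;
  return lastKeyBefore;
}

// After the script runs, a different last key is taken to be the new export.
function findNewKey(globalLike) {
  let last;
  for (const key in globalLike) last = key;
  return last !== lastKeyBefore ? last : undefined;
}

const fakeGlobal = { document: {}, location: {} };
noteLastKey(fakeGlobal);
const before = findNewKey(fakeGlobal);  // nothing has been added yet
fakeGlobal.myLibrary = { mount() {} };  // roughly what a UMD bundle does to window
const after = findNewKey(fakeGlobal);
```

This works because string-keyed properties are enumerated in the order they were added, so a newly attached export shows up last.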